On the Conditioning Consistency Gap in Conditional Neural Processes

arXiv cs.LG / 4/22/2026

Key Points

  • Neural Processes (NPs) are meta-learning models that produce predictive distributions from context sets, but they generally do not satisfy Kolmogorov consistency, meaning they do not strictly define a valid stochastic process.
  • The paper introduces the “conditioning consistency gap,” a KL divergence that quantifies how a conditional neural process (CNP) changes its predictions when an extra point is added to the context versus conditioned upon (see the sketch after this list).
  • It proves that for CNPs with bounded encoders and Lipschitz decoders, the consistency gap shrinks as O(1/n^2) in the context size n, and that this rate is tight.
  • The results imply the inconsistency is typically negligible for moderate context sizes, but may become significant in few-shot settings where n is small.
  • Overall, the study gives a precise theoretical measure of how well CNPs approximate valid stochastic processes, turning a widely noted empirical observation into a quantitative guarantee.
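
As a concrete illustration of the gap, the sketch below computes it for a model with a factorized Gaussian predictive, which is the standard CNP output form: conditioning the joint p(y_t, y* | C) on y* then reduces to the marginal p(y_t | C), so the gap is a closed-form KL between two univariate Gaussians. The model `toy_cnp` is a hypothetical, untrained stand-in (mean-pooled encoder, linear decoder), not the paper's architecture.

```python
import numpy as np

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2))."""
    return (np.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q)**2) / (2 * sigma_q**2)
            - 0.5)

def toy_cnp(context_x, context_y, target_x):
    """Hypothetical CNP stand-in: mean-pool per-point features ('encoder'),
    then map the pooled representation to a Gaussian predictive ('decoder')."""
    feats = np.stack([context_x, context_y, context_x * context_y], axis=-1)
    r = feats.mean(axis=0)                       # permutation-invariant aggregation
    mu = r[0] + 0.5 * r[1] + 0.1 * r[2] * target_x
    sigma = 0.1 + 1.0 / (1.0 + len(context_x))   # uncertainty shrinks with context size
    return mu, sigma

rng = np.random.default_rng(0)
n = 10
cx, cy = rng.normal(size=n), rng.normal(size=n)
x_star, y_star = 0.3, 0.7   # the extra point (x*, y*)
x_t = -0.5                  # target location

# (a) Predict y_t with (x*, y*) appended to the context.
mu_a, sig_a = toy_cnp(np.append(cx, x_star), np.append(cy, y_star), x_t)

# (b) Condition the joint p(y_t, y* | C) on y*. For a factorized
# predictive this is just the marginal p(y_t | C).
mu_c, sig_c = toy_cnp(cx, cy, x_t)

gap = kl_gaussian(mu_a, sig_a, mu_c, sig_c)
print(f"conditioning consistency gap (KL): {gap:.6f}")
```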

Abstract

Neural processes (NPs) are meta-learning models that map context sets to predictive distributions. While inspired by stochastic processes, NPs do not generally satisfy the Kolmogorov consistency conditions required to define a valid stochastic process. This inconsistency is widely acknowledged but poorly understood: practitioners note that NPs work well despite the violation, without quantifying what this means. We address this gap by defining the conditioning consistency gap, a KL divergence measuring how much a conditional neural process (CNP) changes its predictions when a point is added to the context versus conditioned upon. Our main results show that for CNPs with bounded encoders and Lipschitz decoders, the consistency gap is O(1/n^2) in the context size n, and that this rate is tight. These bounds establish the precise sense in which CNPs approximate valid stochastic processes. The inconsistency is negligible for moderate context sizes but can be significant in the few-shot regime.
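
To see the claimed O(1/n^2) rate concretely, the self-contained sketch below sweeps the context size with the same toy stand-in and prints n^2 times the gap; under the theorem's scaling this product should settle at roughly constant order. All names and values here (`toy_cnp`, the fixed extra point, the target location) are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL between two univariate Gaussians."""
    return (np.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q)**2) / (2 * sigma_q**2) - 0.5)

def toy_cnp(cx, cy, xt):
    """Hypothetical CNP stand-in (mean-pooled encoder, linear decoder)."""
    feats = np.stack([cx, cy, cx * cy], axis=-1)
    r = feats.mean(axis=0)
    mu = r[0] + 0.5 * r[1] + 0.1 * r[2] * xt
    sigma = 0.1 + 1.0 / (1.0 + len(cx))
    return mu, sigma

rng = np.random.default_rng(0)
x_star, y_star, x_t = 0.3, 0.7, -0.5
for n in (5, 10, 20, 40, 80, 160):
    cx, cy = rng.normal(size=n), rng.normal(size=n)
    mu_a, s_a = toy_cnp(np.append(cx, x_star), np.append(cy, y_star), x_t)
    mu_c, s_c = toy_cnp(cx, cy, x_t)   # marginal = conditional for factorized outputs
    gap = kl_gaussian(mu_a, s_a, mu_c, s_c)
    print(f"n={n:4d}  gap={gap:.3e}  n^2 * gap={n * n * gap:.4f}")
```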