Uniform Laws of Large Numbers in Product Spaces

arXiv cs.LG / 3/26/2026


Key Points

  • The paper studies uniform laws of large numbers in Cartesian product spaces, extending VC-theory-style uniform convergence beyond standard settings.
  • It shows that, assuming the joint distribution is absolutely continuous with respect to the product of its marginals, a uniform law of large numbers holds for a family of events exactly when the family has finite linear VC dimension.
  • The linear VC dimension is the maximum size of a shattered set lying on an axis-parallel line (a set of vectors that agree on all but at most one coordinate). It never exceeds the classical VC dimension and can be arbitrarily smaller.
  • A key example highlights the gap: the family of convex sets in R^d has linear VC dimension 2, even though its classical VC dimension is infinite for every d≥2.
  • The authors develop estimators that differ substantially from the standard empirical mean, argue such deviations are unavoidable, and emphasize open problems including quantitative sample-complexity bounds.
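
To make the central definition concrete, the linear VC dimension described above can be written out as follows. This is a sketch: the symbols $\mathcal{X}_i$, $\mathcal{F}$, $P$, $P_i$ and the abbreviation $\mathrm{LVC}$ are notation assumed here for illustration and may differ from the paper's.

```latex
% Setting: a product space and a family of events (notation assumed).
\[
  \mathcal{X} = \mathcal{X}_1 \times \dots \times \mathcal{X}_d,
  \qquad
  \mathcal{F} \subseteq 2^{\mathcal{X}} .
\]
% Distributional assumption: absolute continuity w.r.t. the product of marginals.
\[
  P \ll P_1 \otimes \dots \otimes P_d .
\]
% A set S lies on an axis-parallel line if its points agree
% on all coordinates except possibly one coordinate i.
\[
  S \text{ lies on an axis-parallel line}
  \iff
  \exists\, i \;\; \forall\, x, y \in S \;\; \forall\, j \neq i : \; x_j = y_j .
\]
% S is shattered if every subset of S is cut out by some member of the family.
\[
  S \text{ is shattered by } \mathcal{F}
  \iff
  \{\, S \cap F : F \in \mathcal{F} \,\} = 2^{S} .
\]
% Linear VC dimension: largest shattered set on an axis-parallel line.
\[
  \mathrm{LVC}(\mathcal{F})
  = \sup \{\, |S| : S \text{ lies on an axis-parallel line and is shattered by } \mathcal{F} \,\}
  \;\le\; \mathrm{VC}(\mathcal{F}) .
\]
```

The inequality holds because the supremum runs over a subset of the sets considered by the classical VC dimension. For convex sets, the value 2 is consistent with a simple observation: the intersection of a convex set with a line is an interval, and intervals can shatter two collinear points but not three, since no interval contains the two outer points while excluding the middle one.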

Abstract

Uniform laws of large numbers form a cornerstone of Vapnik–Chervonenkis theory, where they are characterized by the finiteness of the VC dimension. In this work, we study uniform convergence phenomena in Cartesian product spaces, under assumptions on the underlying distribution that are compatible with the product structure. Specifically, we assume that the distribution is absolutely continuous with respect to the product of its marginals, a condition that captures many natural settings, including product distributions, sparse mixtures of product distributions, distributions with low mutual information, and more. We show that, under this assumption, a uniform law of large numbers holds for a family of events if and only if the linear VC dimension of the family is finite. The linear VC dimension is defined as the maximum size of a shattered set that lies on an axis-parallel line, namely, a set of vectors that agree on all but at most one coordinate. This dimension is always at most the classical VC dimension, yet it can be arbitrarily smaller. For instance, the family of convex sets in $\mathbb{R}^d$ has linear VC dimension 2, while its VC dimension is infinite already for $d \ge 2$. Our proofs rely on an estimator that departs substantially from the standard empirical mean estimator and exhibits a more intricate structure. We show that such deviations from the standard empirical mean estimator are unavoidable in this setting. Throughout the paper, we propose several open questions, with a particular focus on quantitative sample complexity bounds.
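
The gap between the two dimensions can be checked by brute force on a small finite example. The sketch below is not taken from the paper: it assumes a 3×3 integer grid and uses axis-aligned rectangles as a finite stand-in for convex sets, and the function names (`vc_dimension`, `axis_parallel_lines`, `linear_vc_dimension`) are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations, product


def vc_dimension(points, family):
    """Largest k such that some k-subset of `points` is shattered by `family`.

    `family` is a collection of frozensets of points. Shattering is monotone
    (every subset of a shattered set is shattered), so the search can stop at
    the first k for which no k-subset is shattered.
    """
    points = list(points)
    best = 0
    for k in range(1, len(points) + 1):
        found = False
        for subset in combinations(points, k):
            s = frozenset(subset)
            traces = {s & member for member in family}
            if len(traces) == 2 ** k:  # all 2^k subsets of s are realized
                found = True
                break
        if not found:
            break
        best = k
    return best


def axis_parallel_lines(grid_size, dim):
    """Yield every axis-parallel line of the grid {0..grid_size-1}^dim."""
    for axis in range(dim):
        for rest in product(range(grid_size), repeat=dim - 1):
            # Fix all coordinates except `axis`, which ranges over the grid.
            yield [rest[:axis] + (t,) + rest[axis:] for t in range(grid_size)]


def linear_vc_dimension(grid_size, dim, family):
    """Maximum size of a shattered set lying on a single axis-parallel line."""
    return max(
        vc_dimension(line, family)
        for line in axis_parallel_lines(grid_size, dim)
    )


# Assumed toy family: all axis-aligned rectangles with integer corners
# in the 3x3 grid, each represented by the grid points it contains.
n = 3
intervals = [(a, b) for a in range(n) for b in range(a, n)]
rectangles = [
    frozenset((x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1))
    for (x0, x1) in intervals
    for (y0, y1) in intervals
]

grid = list(product(range(n), repeat=2))
classical = vc_dimension(grid, rectangles)
linear = linear_vc_dimension(n, 2, rectangles)
print("classical VC dimension:", classical)
print("linear VC dimension:", linear)
```

On any single axis-parallel line a rectangle restricts to an interval, which is why the linear quantity stays small even when the classical one grows. The same mechanism underlies the convex-sets example in the abstract, although this finite grid cannot reproduce an infinite classical VC dimension.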