Mean Testing under Truncation beyond Gaussian

arXiv stat.ML / 5/5/2026


Key Points

  • The paper studies the theoretical limits of high-dimensional mean testing when observations come from an unknown truncation set that can conceal up to an ε fraction of probability mass.
  • It derives an information-theoretic detectability threshold: if the signal strength α is below a bias term scaling as O(ν_{P,p} ε^{1-1/p}), then the null and alternative hypotheses become indistinguishable even with infinite data.
  • When α exceeds this threshold, the authors show a straightforward second-order test can achieve near-optimal sample complexity, scaling with the covariance magnitude and the gap (α − 4ν_{P,p}ε^{1-1/p}).
  • The work also identifies a “structural escape” under a directional median regularity condition, where truncation bias improves to O(ε), restoring the classical testing rate in an intermediate regime.
  • Overall, the results unify mean testing behavior across finite-moment, sub-Gaussian-like, and median-regular structural regimes under truncation.
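The truncation bias of order O(ν_{P,p} ε^{1-1/p}) driving these key points can be illustrated numerically: an adversary who hides an ε-fraction of mass in the heavy tail of a finite-moment distribution shifts the observed mean, and the shift grows with ε. The sketch below is a minimal illustration, not the paper's construction; the Student-t distribution, its degrees of freedom, the sample size, and the choice of truncation set are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# Student-t with df=3 has finite p-th moments only for p < 3, a stand-in
# for the paper's finite-moment regime (df and n are illustrative choices).
x = rng.standard_t(df=3, size=n)

def truncation_bias(x, eps):
    """Hide the top eps-fraction of mass (one particular worst-case-style
    truncation set S) and measure the induced shift in the mean."""
    cutoff = np.quantile(x, 1.0 - eps)
    return abs(np.mean(x[x <= cutoff]) - np.mean(x))

for eps in (0.01, 0.05, 0.10):
    print(f"eps={eps:.2f}  bias={truncation_bias(x, eps):.4f}")
```

Larger ε values let the truncation conceal more of the tail, so the measured bias increases with ε, consistent with the ε^{1-1/p} scaling described above.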

Abstract

We characterize the fundamental limits of high-dimensional mean testing under arbitrary truncation, where samples are drawn from the conditional distribution P(\cdot \mid S) for an unknown truncation set S that may hide up to an \varepsilon-fraction of the probability mass. For distributions with p-th directional moments of magnitude at most \nu_{P,p}, truncation induces a bias of order O(\nu_{P,p}\varepsilon^{1-1/p}). This bias creates a sharp information-theoretic detectability floor: when the signal \alpha falls below this threshold, the null and alternative hypotheses are indistinguishable even with infinite data. Above this floor, we prove that a simple second-order test achieves near-optimal sample complexity n = O\!\left(\frac{\|\Sigma_P\|}{(\alpha-4\nu_{P,p}\varepsilon^{1-1/p})^2}\sqrt{d}\right). We further identify a structural escape from this finite-moment bias barrier. Under a directional median regularity assumption, truncation bias improves to linear order O(\varepsilon). This reveals an intermediate regime in which estimation requires \Theta(d) samples for uniform recovery, while testing recovers the classical \Theta(\sqrt d) rate once truncation bias is eliminated. Together, our results provide a unified framework for mean testing under truncation, connecting finite-moment, sub-Gaussian, and median-regular structural regimes.
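The abstract's threshold behavior can be sketched with a toy norm-based test: reject the null only when the empirical mean norm clears the truncation-bias floor 4ν_{P,p}ε^{1-1/p} by part of the remaining gap. This is not the paper's actual second-order statistic; the specific threshold rule, dimensions, sample sizes, and parameter values (ν, p, α) below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_test(X, eps, nu, p, alpha):
    """Illustrative threshold test (not the paper's exact statistic): reject
    the null (zero mean) when the empirical mean norm exceeds the truncation
    bias floor 4*nu*eps**(1 - 1/p) plus half the remaining signal gap."""
    floor = 4.0 * nu * eps ** (1.0 - 1.0 / p)
    tau = floor + (alpha - floor) / 2.0
    return bool(np.linalg.norm(X.mean(axis=0)) > tau)

d, n = 20, 5000
null = rng.normal(0.0, 1.0, (n, d))          # mean zero
shift = np.zeros(d)
shift[0] = 1.0                               # signal strength alpha = 1
alt = null + shift

print(mean_test(null, eps=0.01, nu=1.0, p=2, alpha=1.0))  # expect False
print(mean_test(alt,  eps=0.01, nu=1.0, p=2, alpha=1.0))  # expect True
```

With these toy parameters the floor is 4·(0.01)^{1/2} = 0.4, comfortably below α = 1, so the test separates the two hypotheses; shrinking the gap α − 4ν ε^{1-1/p} toward zero would blow up the required n as in the stated sample-complexity bound.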