Optimal Stability of KL Divergence under Gaussian Perturbations

arXiv cs.LG / 4/14/2026


Key Points

  • The paper formalizes and evaluates how stable KL divergence is under Gaussian perturbations (swapping one Gaussian for another), covering distributions beyond the Gaussian family.
  • If a distribution P has finite second moment, KL(P || N1) is large, and KL(N1 || N2) is at most ε, the paper proves a lower bound: KL(P || N2) falls below KL(P || N1) by at most O(√ε) (see the numerical sketch after this list).
  • This √ε rate is shown to be optimal in general, and the same optimality holds even within the Gaussian family (a worked construction follows the abstract below).
  • It removes the strong assumption in prior work that all involved distributions are Gaussian, overcoming the difficulties posed by the asymmetry of KL divergence and the absence of a triangle inequality in general probability spaces.
  • As an application, it provides a theoretical foundation for KL-based analysis of OOD detection with flow-based generative models, relaxing the strong Gaussian assumptions used in prior work.
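
The lower bound in the second key point can be sanity-checked numerically. The sketch below is our illustration, not code from the paper: it instantiates P, N1, and N2 as univariate Gaussians (so every KL term has a closed form), uses an assumed mean-shift construction with KL(N1 || N2) = ε, and tracks how the drop KL(P || N1) − KL(P || N2) scales against √ε.

```python
import numpy as np

def kl_gauss_1d(mu1, s1, mu2, s2):
    """Closed-form KL( N(mu1, s1^2) || N(mu2, s2^2) ) for univariate Gaussians."""
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

a = 4.0  # P = N(a, 1), so KL(P || N1) = a^2 / 2 is "large"
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    delta = np.sqrt(2 * eps)                      # mean shift giving KL(N1 || N2) = eps
    kl_p_n1 = kl_gauss_1d(a, 1.0, 0.0, 1.0)       # KL(P  || N1) = a^2 / 2
    kl_n1_n2 = kl_gauss_1d(0.0, 1.0, delta, 1.0)  # KL(N1 || N2) = eps
    kl_p_n2 = kl_gauss_1d(a, 1.0, delta, 1.0)     # KL(P  || N2) = (a - delta)^2 / 2
    drop = kl_p_n1 - kl_p_n2
    print(f"eps={eps:.0e}  KL(N1||N2)={kl_n1_n2:.1e}  "
          f"drop={drop:.4f}  drop/sqrt(eps)={drop / np.sqrt(eps):.3f}")
```

In this construction the ratio drop/√ε stays near a√2, i.e., the decrease is Θ(√ε) rather than O(ε), consistent with the paper's claim that the √ε rate is optimal.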

Abstract

We study the problem of characterizing the stability of Kullback-Leibler (KL) divergence under Gaussian perturbations beyond Gaussian families. Existing relaxed triangle inequalities for KL divergence critically rely on the assumption that all involved distributions are Gaussian, which limits their applicability in modern applications such as out-of-distribution (OOD) detection with flow-based generative models. In this paper, we remove this restriction by establishing a sharp stability bound between an arbitrary distribution and Gaussian families under mild moment conditions. Specifically, let $P$ be a distribution with finite second moment, and let $\mathcal{N}_1$ and $\mathcal{N}_2$ be multivariate Gaussian distributions. We show that if $\mathrm{KL}(P \| \mathcal{N}_1)$ is large and $\mathrm{KL}(\mathcal{N}_1 \| \mathcal{N}_2)$ is at most $\epsilon$, then $\mathrm{KL}(P \| \mathcal{N}_2) \ge \mathrm{KL}(P \| \mathcal{N}_1) - O(\sqrt{\epsilon})$. Moreover, we prove that this $\sqrt{\epsilon}$ rate is optimal in general, even within the Gaussian family. This result reveals an intrinsic stability property of KL divergence under Gaussian perturbations, extending classical Gaussian-only relaxed triangle inequalities to general distributions. The result is non-trivial due to the asymmetry of KL divergence and the absence of a triangle inequality in general probability spaces. As an application, we provide a rigorous foundation for KL-based OOD analysis in flow-based models, removing strong Gaussian assumptions used in prior work. More broadly, our result enables KL-based reasoning in non-Gaussian settings arising in deep learning and reinforcement learning.
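
To illustrate why the $\sqrt{\epsilon}$ rate cannot be improved even within the Gaussian family, here is a short worked construction (our sketch under a unit-variance assumption, not an excerpt from the paper), using the closed form $\mathrm{KL}(\mathcal{N}(\mu_1,1) \,\|\, \mathcal{N}(\mu_2,1)) = (\mu_1 - \mu_2)^2 / 2$:

```latex
% Illustrative unit-variance construction attaining the sqrt(eps) rate.
\[
P = \mathcal{N}(a, 1), \qquad
\mathcal{N}_1 = \mathcal{N}(0, 1), \qquad
\mathcal{N}_2 = \mathcal{N}(\delta, 1), \qquad \delta = \sqrt{2\epsilon},
\]
\[
\mathrm{KL}(\mathcal{N}_1 \,\|\, \mathcal{N}_2) = \tfrac{\delta^2}{2} = \epsilon,
\qquad
\mathrm{KL}(P \,\|\, \mathcal{N}_2) = \tfrac{(a - \delta)^2}{2}
= \underbrace{\tfrac{a^2}{2}}_{\mathrm{KL}(P \,\|\, \mathcal{N}_1)} - a\sqrt{2\epsilon} + \epsilon.
\]
```

For fixed $a$ the decrease $a\sqrt{2\epsilon} - \epsilon$ is $\Theta(\sqrt{\epsilon})$, so no bound of the form $\mathrm{KL}(P \| \mathcal{N}_2) \ge \mathrm{KL}(P \| \mathcal{N}_1) - O(\epsilon)$ can hold, even with all three distributions Gaussian.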