A Sufficient-Statistic Reduction of the Information Bottleneck to a Low-Dimensional Problem

arXiv stat.ML / 4/30/2026

Key Points

  • The paper proves that when the conditional distribution p(C|T) depends on T only through a sufficient statistic ϕ(T), the Information Bottleneck (IB) problem for (T,C) is exactly equivalent to the IB problem for (ϕ(T),C).
  • This reduction is loss-free: it preserves the entire IB curve, the Lagrangian optimum at every trade-off parameter β, and the optimal representations up to pullback through ϕ.
  • The authors show that the computational complexity of solving IB is determined by the dimension of the sufficient statistic rather than the dimension of the original source variable T.
  • They connect the result to known regimes by deriving the classical Gaussian IB solution as an immediate corollary and proposing a nonlinear-Gaussian generalization.
  • A small numerical example demonstrates the practical benefit: with an available low-dimensional sufficient statistic, the full exact IB curve can be computed using the reduced problem at a cost tied to the statistic’s dimension.
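The invariance described in the points above can be checked on a toy discrete problem. The following is a minimal sketch, not the paper's numerical example: the statistic `phi(t) = t mod 3`, the alphabet sizes, the value `beta = 5.0`, and the helper names are all hypothetical choices made for illustration, and the solver is the standard iterative (Blahut-Arimoto style) IB update rather than anything the paper specifies. It builds a p(C|T) that depends on T only through ϕ(T), runs the iteration on the reduced pair (ϕ(T), C), pulls the resulting encoder back through ϕ, and compares the two points on the information plane.

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) in nats for a joint probability table p(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

def ib_iterate(p_x, p_c_given_x, n_z, beta, n_iter=200, seed=0):
    """Standard iterative IB updates; returns an encoder q(z|x).

    This finds a fixed point of the self-consistent equations, which is
    not guaranteed to be the global optimum.
    """
    rng = np.random.default_rng(seed)
    q = rng.random((len(p_x), n_z))
    q /= q.sum(axis=1, keepdims=True)
    eps = 1e-12
    for _ in range(n_iter):
        q_z = p_x @ q                                        # q(z)
        q_c_given_z = (q * p_x[:, None]).T @ p_c_given_x     # q(z, c)
        q_c_given_z /= (q_z[:, None] + eps)                  # q(c|z)
        # KL(p(c|x) || q(c|z)) for every pair (x, z)
        log_ratio = (np.log(p_c_given_x[:, None, :] + eps)
                     - np.log(q_c_given_z[None, :, :] + eps))
        kl = np.sum(p_c_given_x[:, None, :] * log_ratio, axis=2)
        logits = np.log(q_z + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)          # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
    return q

def ib_point(p_x, p_c_given_x, q):
    """Return (I(X;Z), I(Z;C)) for encoder q(z|x)."""
    i_xz = mutual_information(p_x[:, None] * q)
    i_zc = mutual_information((q * p_x[:, None]).T @ p_c_given_x)
    return i_xz, i_zc

# Toy source: 12 values of T, collapsed to 3 values by a hypothetical phi.
rng = np.random.default_rng(1)
n_t, n_s, n_c = 12, 3, 2
phi = np.arange(n_t) % n_s
p_t = rng.dirichlet(np.ones(n_t))
p_c_given_s = rng.dirichlet(np.ones(n_c), size=n_s)   # shape (n_s, n_c)
p_c_given_t = p_c_given_s[phi]                        # depends on t only via phi(t)
p_s = np.bincount(phi, weights=p_t, minlength=n_s)    # law of phi(T)

# Relevant information is unchanged by passing to the statistic.
i_tc = mutual_information(p_t[:, None] * p_c_given_t)
i_sc = mutual_information(p_s[:, None] * p_c_given_s)

# Solve on the reduced problem, then pull the encoder back through phi.
q_z_given_s = ib_iterate(p_s, p_c_given_s, n_z=3, beta=5.0)
q_z_given_t = q_z_given_s[phi]

reduced_point = ib_point(p_s, p_c_given_s, q_z_given_s)
full_point = ib_point(p_t, p_c_given_t, q_z_given_t)
print("I(T;C) vs I(phi(T);C):", i_tc, i_sc)
print("reduced (I(S;Z), I(Z;C)):", reduced_point)
print("pulled-back (I(T;Z), I(Z;C)):", full_point)
```

In this sketch the iteration only ever touches a 3-row encoder instead of a 12-row one, which is the discrete analogue of the cost depending on the statistic rather than on the source. The check shows that the pulled-back encoder attains the same information-plane point as the reduced one; it does not by itself show that no other encoder on T does better, which is the content of the paper's equivalence result.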

Abstract

We show that if the conditional distribution p(C | T) factors through a sufficient statistic ϕ(T), then the Information Bottleneck (IB) problem for (T, C) is exactly equivalent to the IB problem for (ϕ(T), C). The reduction is loss-free: it preserves the full IB curve, the Lagrangian optimum at every trade-off parameter β, and the optimal representations up to pullback through ϕ. As a result, the computational complexity of solving the IB problem is governed by the dimension of the sufficient statistic rather than the ambient dimension of the source. This identifies an exact structural condition under which the generic IB problem becomes tractable, and gives a formal bridge between the discrete and linear-Gaussian regimes. We then show that the classical Gaussian IB solution of Chechik, Globerson, Tishby and Weiss is an immediate corollary of this reduction, and we state a nonlinear-Gaussian generalisation. A small numerical example illustrates the practical consequence: when a low-dimensional sufficient statistic is available, the exact IB curve can be computed on the reduced problem at a cost determined by the statistic rather than by the ambient source dimension.
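For readers who want the reduction written out, the following is a sketch of one standard argument in the discrete case. The notation is assumed rather than taken from the paper: Z denotes the representation, S = ϕ(T), and the Lagrangian is written in one common convention, which may differ from the paper's in sign or in where β is placed. The paper's own proof may proceed differently.

```latex
% Assumed convention for the IB Lagrangian, with Z - T - C a Markov chain.
\[
  \mathcal{L}_\beta\bigl[p(z \mid t)\bigr] \;=\; I(T;Z) \;-\; \beta\, I(Z;C).
\]

% Sufficiency hypothesis: p(c | t) depends on t only through s = \phi(t).
\[
  p(c \mid t) = p\bigl(c \mid \phi(t)\bigr)
  \quad\Longrightarrow\quad
  I(T;C) = I\bigl(\phi(T);C\bigr).
\]

% Given any encoder p(z | t), average it over the fibres of \phi.
\[
  q(z \mid s) \;=\; \sum_{t \,:\, \phi(t) = s} \frac{p(t)}{p(s)}\, p(z \mid t).
\]

% The joint law of (Z, C) is unchanged, so the relevance term is unchanged.
\[
  p(z, c) \;=\; \sum_{t} p(t)\, p(z \mid t)\, p\bigl(c \mid \phi(t)\bigr)
          \;=\; \sum_{s} p(s)\, q(z \mid s)\, p(c \mid s).
\]

% The compression term can only decrease (data processing, since S = \phi(T)).
\[
  I(S;Z) \;\le\; I(T;Z),
  \qquad\text{with equality when } p(z \mid t) = q\bigl(z \mid \phi(t)\bigr).
\]
```

Under these assumptions, every encoder on T is matched or improved by one that factors through ϕ, and such encoders are exactly the pullbacks of encoders on S, which is consistent with the equivalence the abstract states.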