Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty

arXiv stat.ML · March 24, 2026


Key Points

  • The paper studies calibration as a conditional property that depends on how much information a predictor retains, using decomposition identities tied to arbitrary proper scoring losses.
  • It shows that, at a given information level, the expected proper loss splits into a reliability (proper-regret) term and a conditional entropy term representing residual uncertainty.
  • For nested information levels, it provides a chain decomposition that quantifies how much information (and thus expected loss reduction) is gained when moving from one representation to a richer one.
  • In classification, the framework yields a three-term breakdown—miscalibration, a grouping term capturing information loss from features X to a score S, and irreducible uncertainty at the feature level.
  • The authors apply the identities to analyze post-hoc recalibration, aggregation of calibrated models, and stagewise/boosting approaches, with explicit results for Brier score and log loss.

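The three-term breakdown is easy to verify numerically for the Brier score. The following sketch (our own toy distribution and variable names, not the paper's) builds a discrete example where features `X` take four values, the score `S = s(X)` merges them into two groups, and the expected loss splits exactly into miscalibration, grouping, and irreducible uncertainty:

```python
# Numeric check of the three-term Brier decomposition (illustrative sketch;
# the toy distribution below is ours, not taken from the paper).
# Setup: X uniform on {0,1,2,3}, q[x] = P(Y=1 | X=x), and a coarser score
# s(X) that groups features {0,1} and {2,3}.

p = [0.25, 0.25, 0.25, 0.25]          # P(X = x)
q = [0.1, 0.3, 0.6, 0.9]              # P(Y = 1 | X = x), feature-level posterior
s = [0.2, 0.2, 0.8, 0.8]              # score S = s(X), constant on groups

# Q_S = E[Y | S]: average of q within each score level, weighted by P(X)
levels = sorted(set(s))
qs_by_level = {
    v: sum(p[x] * q[x] for x in range(4) if s[x] == v)
       / sum(p[x] for x in range(4) if s[x] == v)
    for v in levels
}
qS = [qs_by_level[s[x]] for x in range(4)]

# Total expected Brier loss: E[(S - Y)^2] = E[(S - Q_X)^2] + E[Q_X (1 - Q_X)]
total = sum(p[x] * ((s[x] - q[x]) ** 2 + q[x] * (1 - q[x])) for x in range(4))

miscal   = sum(p[x] * (s[x] - qS[x]) ** 2 for x in range(4))  # reliability
grouping = sum(p[x] * (q[x] - qS[x]) ** 2 for x in range(4))  # info loss X -> S
uncert   = sum(p[x] * q[x] * (1 - q[x]) for x in range(4))    # irreducible

assert abs(total - (miscal + grouping + uncert)) < 1e-12
print(total, miscal, grouping, uncert)
```

Here the score is nearly calibrated (small reliability term) but discards information by merging feature values with different posteriors, so the grouping term is an order of magnitude larger; the irreducible term dominates both.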
Abstract

Calibration is a conditional property that depends on the information retained by a predictor. We develop decomposition identities for arbitrary proper losses that make this dependence explicit. At any information level $\mathcal{A}$, the expected loss of an $\mathcal{A}$-measurable predictor splits into a proper-regret (reliability) term and a conditional entropy (residual uncertainty) term. For nested levels $\mathcal{A} \subseteq \mathcal{B}$, a chain decomposition quantifies the information gain from $\mathcal{A}$ to $\mathcal{B}$. Applied to classification with features $\boldsymbol{X}$ and score $S = s(\boldsymbol{X})$, this yields a three-term identity: miscalibration, a *grouping* term measuring information loss from $\boldsymbol{X}$ to $S$, and irreducible uncertainty at the feature level. We leverage the framework to analyze post-hoc recalibration, aggregation of calibrated models, and stagewise/boosting constructions, with explicit forms for Brier and log loss.
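In notation of our own choosing (writing $Q_{\boldsymbol{X}} = \mathbb{E}[Y \mid \boldsymbol{X}]$ for the feature-level posterior, $Q_S = \mathbb{E}[Y \mid S]$ for its score-level coarsening, $d$ for the divergence induced by the proper loss, and $H$ for its generalized entropy), the three-term identity described in the abstract should take the form

$$
\mathbb{E}\,\ell(S, Y)
\;=\;
\underbrace{\mathbb{E}\, d(Q_S, S)}_{\text{miscalibration}}
\;+\;
\underbrace{\mathbb{E}\, d(Q_{\boldsymbol{X}}, Q_S)}_{\text{grouping}}
\;+\;
\underbrace{\mathbb{E}\, H(Q_{\boldsymbol{X}})}_{\text{irreducible uncertainty}},
$$

where the middle term vanishes exactly when the score retains all of the predictive information in $\boldsymbol{X}$. For the Brier score, $d$ is squared distance and $H(q) = q(1-q)$; for log loss, $d$ is KL divergence and $H$ is Shannon entropy. This is a sketch reconstructed from the abstract's description; see the paper for the precise statement.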