Towards Accurate and Calibrated Classification: Regularizing Cross-Entropy From A Generative Perspective

arXiv cs.LG / 4/9/2026


Key Points

  • The paper addresses a core problem in deep learning classification: models achieve high accuracy but produce poorly calibrated (overconfident) probability estimates, often linked to overfitting of the negative log-likelihood (NLL).
  • It proposes Generative Cross-Entropy (GCE), a generative-perspective reformulation of cross-entropy that maximizes p(x|y) and adds a class-level confidence regularizer while remaining strictly proper under mild conditions.
  • Experiments on CIFAR-10/100, Tiny-ImageNet, and a medical imaging benchmark show GCE improves both accuracy and calibration compared with standard cross-entropy, with the gains particularly pronounced under long-tailed class distributions.
  • When combined with adaptive piecewise temperature scaling (ATS), GCE achieves calibration on par with focal-loss variants without the typical accuracy drop those methods can cause.
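The summary does not give the paper's exact GCE formulation, but the Bayes decomposition log p(x|y) = log p(y|x) + log p(x) − log p(y) suggests how a generative objective can reduce to cross-entropy plus a class-level term. The sketch below is a minimal numpy illustration under that assumption; the function name, the weighting factor `lam`, and the batch-mean estimate of the model's marginal p(y) are all hypothetical choices, not the paper's definition.

```python
import numpy as np

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def generative_cross_entropy(logits, labels, lam=1.0):
    """Hypothetical GCE-style loss sketch (not the paper's exact form).

    By Bayes' rule, log p(x|y) = log p(y|x) + log p(x) - log p(y).
    Dropping the label-independent log p(x), minimizing -log p(x|y)
    gives standard cross-entropy plus a +log p(y) term: a class-level
    regularizer that penalizes probability mass the model concentrates
    on the true class's marginal. Here p(y) is estimated from the
    batch's mean softmax (an assumption made for this sketch).
    """
    probs = softmax(logits)
    n = len(labels)
    # standard discriminative term: -log p(y|x)
    ce = -np.log(probs[np.arange(n), labels]).mean()
    # batch estimate of the model's marginal class distribution p(y)
    p_y = probs.mean(axis=0)
    # class-level confidence regularizer: +log p(y) at the true labels
    reg = np.log(p_y[labels]).mean()
    return ce + lam * reg
```

On perfectly uniform logits the two terms cancel exactly (CE = log K, regularizer = −log K), which is one way to see that the regularizer only bites when the model's marginal class confidence becomes skewed.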

Abstract

Accurate classification requires not only high predictive accuracy but also well-calibrated confidence estimates. Yet, modern deep neural networks (DNNs) are often overconfident, primarily due to overfitting on the negative log-likelihood (NLL). While focal loss variants alleviate this issue, they typically reduce accuracy, revealing a persistent trade-off between calibration and predictive performance. Motivated by the complementary strengths of generative and discriminative classifiers, we propose Generative Cross-Entropy (GCE), which maximizes p(x|y) and is equivalent to cross-entropy augmented with a class-level confidence regularizer. Under mild conditions, GCE is strictly proper. Across CIFAR-10/100, Tiny-ImageNet, and a medical imaging benchmark, GCE improves both accuracy and calibration over cross-entropy, especially in the long-tailed scenario. Combined with adaptive piecewise temperature scaling (ATS), GCE attains calibration competitive with focal-loss variants without sacrificing accuracy.
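The ATS variant used in the paper is adaptive and piecewise, and its details are not given in this summary. As background, plain temperature scaling rescales validation logits by a single scalar T chosen to minimize NLL, leaving the argmax (and hence accuracy) unchanged. A minimal grid-search sketch, with the grid range as an arbitrary assumption:

```python
import numpy as np

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=np.linspace(0.25, 10.0, 40)):
    """Plain (single-T) temperature scaling via grid search on
    validation NLL. This is the simple baseline, not the paper's
    adaptive piecewise variant (ATS); the grid is an assumption."""
    def nll(T):
        p = softmax(logits / T)
        return -np.log(p[np.arange(len(labels)), labels]).mean()
    return min(grid, key=nll)
```

For an overconfident model — sharp logits but only modest accuracy — the fitted T comes out well above 1, softening the predicted probabilities toward the observed accuracy without changing which class is predicted.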