Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

arXiv stat.ML / 4/8/2026


Key Points

  • Cross-entropy-trained neural classifiers are accurate but do not directly provide reliable predictive uncertainty, and softmax scores for the correct class can vary across independent training runs.
  • The paper proposes an ensemble-based Dirichlet parameter estimation method that uses a method-of-moments estimator (optionally followed by a maximum-likelihood refinement) to produce explicit Dirichlet predictive distributions.
  • By deriving uncertainty from ensembles of softmax outputs, the approach avoids the sensitivity of Evidential Deep Learning to evidential loss design choices such as loss formulation, priors, and activations.
  • Experiments across multiple datasets indicate that the ensemble-derived Dirichlet uncertainty is more stable across training runs and improves performance on uncertainty-guided downstream tasks.
  • The authors demonstrate better performance in applications like prediction confidence scoring and selective classification, where uncertainty estimates drive decision-making.
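
The method-of-moments construction described in the bullets above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the per-class averaging of the concentration estimate, the `eps` floor, and the function name `dirichlet_mom` are my own choices. It uses the Dirichlet identities E[p_k] = α_k/α₀ and Var[p_k] = m_k(1 − m_k)/(α₀ + 1) to back out concentration parameters from ensemble softmax outputs.

```python
import numpy as np

def dirichlet_mom(probs, eps=1e-8):
    """Method-of-moments Dirichlet fit to an ensemble of softmax outputs.

    probs: (M, K) array, one softmax probability vector per ensemble member.
    Returns alpha: (K,) estimated Dirichlet concentration parameters.
    """
    m = probs.mean(axis=0)                 # per-class sample mean of E[p_k]
    v = probs.var(axis=0, ddof=1) + eps    # per-class sample variance (floored)
    # For Dirichlet(alpha): Var[p_k] = m_k * (1 - m_k) / (alpha0 + 1),
    # so each class gives an estimate alpha0 = m_k * (1 - m_k) / v_k - 1.
    s = m * (1.0 - m) / v - 1.0
    alpha0 = max(float(np.mean(s)), eps)   # average across classes for stability
    return alpha0 * m                      # alpha_k = alpha0 * E[p_k]
```

The paper's optional maximum-likelihood refinement would then start a numerical optimizer from this moment estimate; that step is omitted here.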

Abstract

Neural network classifiers trained with cross-entropy loss achieve strong predictive accuracy but do not inherently provide predictive uncertainty estimates, so external techniques are required to obtain them. In addition, softmax scores for the true class can vary substantially across independent training runs, which limits the reliability of uncertainty-based decisions in downstream tasks. Evidential Deep Learning aims to address these limitations by producing uncertainty estimates in a single forward pass, but evidential training is highly sensitive to design choices including loss formulation, prior regularization, and activation functions. This work therefore introduces an alternative Dirichlet parameter estimation strategy: a method-of-moments estimator applied to ensembles of softmax outputs, with an optional maximum-likelihood refinement step. This ensemble-based construction decouples uncertainty estimation from fragile evidential loss design while also mitigating the variability of single-run cross-entropy training, yielding explicit Dirichlet predictive distributions. Across multiple datasets, we show that the improved stability and predictive uncertainty behavior of these ensemble-derived Dirichlet estimates translate into stronger performance in downstream uncertainty-guided applications such as prediction confidence scoring and selective classification.
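
Once an explicit Dirichlet predictive distribution is available, selective classification reduces to thresholding an uncertainty score derived from it. A minimal sketch, under my own assumptions: I use the entropy (in nats) of the predictive mean as the uncertainty score, which is one common choice but not a score the abstract prescribes, and the threshold value is illustrative.

```python
import numpy as np

def selective_predict(alpha, max_entropy=0.5):
    """Abstain-or-predict rule driven by a Dirichlet predictive distribution.

    alpha: (K,) Dirichlet concentration parameters for one input.
    Returns (predicted class index or None when abstaining, entropy score).
    """
    p = alpha / alpha.sum()                           # predictive mean E[p | alpha]
    entropy = float(-(p * np.log(p + 1e-12)).sum())   # entropy of the mean, in nats
    if entropy > max_entropy:
        return None, entropy                          # abstain: too uncertain
    return int(np.argmax(p)), entropy                 # confident: predict top class
```

A sharply peaked Dirichlet such as α = (50, 2, 1) yields a low-entropy mean and a prediction, while a near-uniform α leads the rule to abstain; sweeping `max_entropy` traces out the coverage-versus-accuracy curve typically reported for selective classification.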