Credal Concept Bottleneck Models for Epistemic-Aleatoric Uncertainty Decomposition

arXiv cs.AI / 4/28/2026

📰 News · Models & Research

Key Points

  • The paper introduces CREDENCE, a new Concept Bottleneck Model (CBM) framework that decomposes concept-level uncertainty into epistemic (reducible) and aleatoric (irreducible) components.
  • CREDENCE represents each concept as a credal prediction (a probability interval), enabling uncertainty estimation grounded in the model’s own probabilistic outputs.
  • Epistemic uncertainty is derived from disagreement across diverse concept heads, while aleatoric uncertainty is estimated using a dedicated ambiguity output trained to reflect annotator disagreement when available.
  • The approach is designed to be actionable, supporting decision policies such as automating low-uncertainty cases, collecting more data for high-epistemic cases, routing high-aleatoric cases to human review, and abstaining when both are high.
  • Experiments across multiple tasks show that epistemic uncertainty correlates with prediction errors, while aleatoric uncertainty tracks annotator disagreement, providing signal beyond error correlation alone.
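The decomposition described in the key points can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: it assumes K diverse concept heads each emit a probability for a concept, the credal interval is taken as the min–max span across heads, epistemic uncertainty is the interval width (head disagreement), and the aleatoric estimate comes from a separate ambiguity output. All function and variable names here are hypothetical.

```python
import numpy as np

def decompose_uncertainty(head_probs, ambiguity_score):
    """Sketch of a CREDENCE-style decomposition for one concept.

    head_probs: per-head concept probabilities in [0, 1] from K diverse heads.
    ambiguity_score: output of a dedicated ambiguity head (aleatoric estimate).
    Returns the credal interval plus epistemic/aleatoric components.
    """
    head_probs = np.asarray(head_probs, dtype=float)
    lower, upper = head_probs.min(), head_probs.max()  # credal interval bounds
    epistemic = upper - lower            # disagreement across concept heads
    aleatoric = float(ambiguity_score)   # irreducible input ambiguity
    return {"interval": (lower, upper),
            "epistemic": epistemic,
            "aleatoric": aleatoric}

# Example: three heads disagree moderately; the ambiguity head says the
# input itself is fairly clear.
out = decompose_uncertainty([0.55, 0.70, 0.60], ambiguity_score=0.10)
```

Under this reading, a wide interval with low ambiguity suggests the model is underspecified (more data could help), while a narrow interval with high ambiguity suggests the input is inherently hard.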

Abstract

Concept Bottleneck Models (CBMs) predict through human-interpretable concepts, but they typically output point concept probabilities that conflate epistemic uncertainty (reducible model underspecification) with aleatoric uncertainty (irreducible input ambiguity). This makes concept-level uncertainty hard to interpret and, more importantly, hard to act upon. We introduce CREDENCE (Credal Ensemble Concept Estimation), a CBM framework that decomposes concept uncertainty by construction. CREDENCE represents each concept as a credal prediction (a probability interval), derives epistemic uncertainty from disagreement across diverse concept heads, and estimates aleatoric uncertainty via a dedicated ambiguity output trained to match annotator disagreement when available. The resulting signals support prescriptive decisions: automate low-uncertainty cases, prioritize data collection for high-epistemic cases, route high-aleatoric cases to human review, and abstain when both are high. Across several tasks, we show that epistemic uncertainty is positively associated with prediction errors, whereas aleatoric uncertainty closely tracks annotator disagreement, providing guidance beyond error correlation. Our implementation is available at the following link: https://github.com/Tankiit/Credal_Sets/tree/ensemble-credal-cbm
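The four-way decision policy in the abstract (automate, collect data, human review, abstain) can be expressed as a simple thresholded router. The thresholds and function names below are placeholders for illustration, not values from the paper:

```python
def route(epistemic, aleatoric, eps_thr=0.2, ale_thr=0.2):
    """Illustrative policy over the two uncertainty signals.

    Thresholds are hypothetical; in practice they would be tuned
    per task (e.g., on a validation set) rather than fixed.
    """
    if epistemic >= eps_thr and aleatoric >= ale_thr:
        return "abstain"            # both uncertainties high
    if epistemic >= eps_thr:
        return "collect_more_data"  # reducible: model underspecification
    if aleatoric >= ale_thr:
        return "human_review"       # irreducible: ambiguous input
    return "automate"               # both uncertainties low
```

The key property the decomposition buys is that each branch has a distinct remedy: only epistemic uncertainty is expected to shrink with more training data, so routing on the combined (conflated) uncertainty alone could waste labeling effort on inherently ambiguous inputs.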