ERA: Evidence-based Reliability Alignment for Honest Retrieval-Augmented Generation

arXiv cs.AI / 4/25/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper introduces ERA (Evidence-based Reliability Alignment), a framework for improving reliability and abstention behavior in Retrieval-Augmented Generation (RAG) systems when internal model knowledge conflicts with retrieved evidence.
  • It replaces scalar confidence estimation with explicit evidence distributions by modeling internal and retrieved knowledge as independent belief masses using the Dirichlet distribution.
  • To measure and leverage conflicts between information sources, ERA applies Dempster–Shafer Theory (DST) to quantify the geometric disagreement between sources.
  • The method separates epistemic uncertainty from aleatoric (data) ambiguity and adjusts the optimization objective based on detected knowledge conflict.
  • Experiments on standard benchmarks and a curated generalization dataset show that ERA outperforms existing baselines, achieving better calibration and an improved coverage–abstention trade-off.
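The DST step in the points above combines two belief-mass assignments (e.g., one from the model's parametric knowledge, one from retrieved evidence) and yields a conflict term that grows with their disagreement. A minimal sketch of Dempster's rule of combination, using an illustrative two-answer example (the sources, masses, and frame here are invented for illustration and are not the paper's):

```python
# Sketch of Dempster's rule of combination over a small frame of
# discernment. Focal elements are frozensets; the conflict K is the
# total mass the two sources jointly assign to the empty intersection.
from itertools import product

def combine(m1, m2):
    """Combine two mass functions; return (normalized masses, conflict K)."""
    conflict = 0.0
    combined = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        w = wa * wb
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w
        else:
            conflict += w  # mass falling on the empty set
    # Normalize by (1 - K); K near 1 signals a strong knowledge conflict
    combined = {s: w / (1.0 - conflict) for s, w in combined.items()}
    return combined, conflict

A, B = frozenset({"A"}), frozenset({"B"})
AB = frozenset({"A", "B"})  # "either A or B" (remaining ignorance)

internal = {A: 0.7, AB: 0.3}   # parametric knowledge leans toward answer A
retrieved = {B: 0.6, AB: 0.4}  # retrieved evidence leans toward answer B

fused, K = combine(internal, retrieved)  # here K = 0.7 * 0.6 = 0.42
```

In a RAG reliability setting, a large `K` would flag that the two knowledge sources disagree, which is exactly the signal an abstention policy can act on.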

Abstract

Retrieval-Augmented Generation (RAG) grounds language models in factual evidence but introduces critical challenges when knowledge internalized in model parameters conflicts with retrieved information. Existing reliability methods, which typically rely on scalar confidence, fail to explicitly distinguish epistemic uncertainty from inherent data ambiguity in such hybrid scenarios. In this paper, we propose ERA (Evidence-based Reliability Alignment), a framework that enhances abstention behavior in RAG systems by shifting confidence estimation from scalar probabilities to explicit evidence distributions. Our method consists of two main components: (1) Contextual Evidence Quantification, which models internal and external knowledge as independent belief masses via the Dirichlet distribution, and (2) Quantifying Knowledge Conflict, which leverages Dempster–Shafer Theory (DST) to rigorously measure the geometric discordance between information sources. Together, these components disentangle epistemic from aleatoric uncertainty and modulate the optimization objective based on detected conflicts. Experiments on standard benchmarks and a curated generalization dataset demonstrate that our approach significantly outperforms baselines, achieving a better trade-off between answer coverage and abstention with superior calibration.
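The abstract's move from scalar confidence to evidence distributions can be illustrated in the style of subjective logic / evidential learning, where per-class evidence counts parameterize a Dirichlet and the "vacuity" mass captures epistemic uncertainty. This is a hedged sketch under that standard parameterization (alpha_k = e_k + 1), not ERA's exact formulation; the evidence vectors below are made up:

```python
# Dirichlet-based uncertainty sketch (subjective-logic convention,
# assumed here; not the paper's exact equations). Evidence e_k >= 0
# gives alpha_k = e_k + 1; vacuity u = K / S is epistemic uncertainty
# (high when total evidence is scarce), while the expected probabilities
# reflect how ambiguous the data itself is.

def dirichlet_uncertainty(evidence):
    K = len(evidence)                 # number of classes
    alpha = [e + 1.0 for e in evidence]
    S = sum(alpha)                    # Dirichlet strength
    probs = [a / S for a in alpha]    # expected class probabilities
    vacuity = K / S                   # epistemic: driven by lack of evidence
    return probs, vacuity

# Scarce evidence -> high vacuity: a good candidate for abstention
p1, u1 = dirichlet_uncertainty([0.5, 0.5, 0.5])
# Abundant, one-sided evidence -> low vacuity: answer confidently
p2, u2 = dirichlet_uncertainty([40.0, 1.0, 1.0])
```

The point of the distributional view is visible here: a scalar confidence could look identical in both cases, while the Dirichlet strength separates "uncertain because under-evidenced" from "uncertain because the evidence itself is split".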