Beyond Semantic Relevance: Counterfactual Risk Minimization for Robust Retrieval-Augmented Generation

arXiv cs.CL / 5/5/2026

📰 News · Models & Research

Key Points

  • Most existing RAG systems optimize retrieval for semantic relevance, but this can fail in decision-making contexts where user queries contain cognitive biases.
  • The paper identifies a “Relevance-Robustness Gap,” where higher relevance can lead to retrieving sycophantic evidence that reinforces hallucinations.
  • It introduces CoRM-RAG, which minimizes counterfactual risk by aligning retrieval with decision safety rather than similarity.
  • Using causal intervention, the approach simulates cognitive biases via a Cognitive Perturbation Protocol during training and distills the result into a lightweight Evidence Critic for scoring.
  • Experiments on decision benchmarks show CoRM-RAG outperforms strong dense retrievers and LLM rerankers under adversarial perturbations, including improved risk-aware abstention through robustness scoring.

Abstract

Standard Retrieval-Augmented Generation (RAG) systems predominantly rely on semantic relevance as a proxy for utility. However, this assumption collapses in realistic decision-making scenarios where user queries are laden with cognitive biases, such as false premises or confirmation bias. In such cases, maximizing relevance paradoxically promotes the retrieval of sycophantic evidence that reinforces hallucinations, a critical failure we term the "Relevance-Robustness Gap". To bridge this gap, we propose CoRM-RAG (Counterfactual Risk Minimization for RAG), a framework that aligns retrieval with decision safety rather than mere similarity. Grounded in causal intervention, we introduce a Cognitive Perturbation Protocol to simulate user biases during training, which is then distilled into a lightweight Evidence Critic. This scoring module learns to identify documents that possess sufficient evidential strength to steer the model toward correctness despite adversarial query perturbations. Extensive experiments on decision-making benchmarks demonstrate that CoRM-RAG significantly outperforms strong dense retrievers and LLM-based rerankers in adversarial settings, while enabling effective risk-aware abstention through reliable robustness scoring. Our code is available at https://github.com/PeiYangLiu/CoRM-RAG.git.
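To make the core idea concrete, here is a minimal sketch of robustness-oriented scoring with abstention. This is not the paper's implementation: the actual Evidence Critic is a learned, distilled model, and `perturb_query`, `robustness_score`, and `retrieve_or_abstain` are hypothetical helpers. The sketch assumes robustness can be approximated by checking whether a document steers an answerer to the same answer under bias-injecting query perturbations.

```python
def perturb_query(query: str) -> list[str]:
    """Stand-in for the Cognitive Perturbation Protocol: inject a
    false-premise framing and a confirmation-bias framing (illustrative)."""
    return [
        f"Given that the opposite is widely known to be true, {query}",
        f"Confirm that the expected answer holds for: {query}",
    ]

def robustness_score(doc: str, query: str, answer_fn) -> float:
    """Fraction of perturbed queries for which the document still steers
    the answerer to the same answer it gives for the clean query."""
    reference = answer_fn(query, doc)
    variants = perturb_query(query)
    agree = sum(answer_fn(q, doc) == reference for q in variants)
    return agree / len(variants)

def retrieve_or_abstain(docs, query, answer_fn, tau=0.6):
    """Rank documents by robustness rather than similarity; abstain
    (return None) when no document clears the risk threshold tau."""
    if not docs:
        return None
    best = max(docs, key=lambda d: robustness_score(d, query, answer_fn))
    if robustness_score(best, query, answer_fn) < tau:
        return None  # risk-aware abstention
    return best
```

A sycophantic document, whose induced answer flips with the query's framing, scores low under this criterion even if it is highly similar to the query, while a document with strong evidential support scores high regardless of perturbation; that asymmetry is the gap between relevance and robustness the paper targets.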