Feedback Adaptation for Retrieval-Augmented Generation

arXiv cs.CL / 4/9/2026

Key Points

  • The paper argues that RAG evaluation should account for how systems change after receiving user/expert corrective feedback, rather than only measuring accuracy under static conditions.
  • It introduces “feedback adaptation” as a problem setting for RAG and proposes two metrics: correction lag (how quickly behavior changes after feedback is given) and post-feedback performance (reliability on semantically related future queries).
  • Experiments indicate a trade-off for training-based methods: more reliable adaptation to related queries tends to come with delayed correction.
  • The authors propose PatchRAG, an inference-time (no-retraining) method intended to apply feedback immediately while maintaining strong generalization to related queries under their metrics.
  • Overall, the work reframes interactive RAG behavior as a measurable dimension and highlights that current evaluation protocols overlook feedback propagation dynamics.
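
To make the two metrics concrete, here is a minimal Python sketch of how they could be computed from an evaluation log. The summary does not give the paper's exact formulas, so everything here is an assumption for illustration: the `Interaction` record, counting lag in "steps since feedback", and scoring post-feedback performance as plain accuracy over related queries.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Interaction:
    """Hypothetical log record for one query answered after feedback was given."""
    step: int       # queries (or update steps) elapsed since the feedback
    correct: bool   # True if the answer reflects the corrected behavior


def correction_lag(interactions: Sequence[Interaction]) -> Optional[int]:
    """Return the earliest step at which the answer reflects the feedback.

    Returns None if the system never shows the corrected behavior in the log.
    """
    for item in sorted(interactions, key=lambda i: i.step):
        if item.correct:
            return item.step
    return None


def post_feedback_performance(related_correct: Sequence[bool]) -> float:
    """Fraction of semantically related queries answered correctly after feedback."""
    if not related_correct:
        raise ValueError("need at least one related query")
    return sum(related_correct) / len(related_correct)


# Usage: a system that only starts answering correctly two steps after feedback.
log = [Interaction(0, False), Interaction(1, False), Interaction(2, True)]
lag = correction_lag(log)
accuracy = post_feedback_performance([True, True, False, True])
```

Under these assumptions, a method that applies feedback immediately would have a lag of 0, while a training-based method would only show the correction after its next update.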

Abstract

Retrieval-Augmented Generation (RAG) systems are typically evaluated under static assumptions, despite being frequently corrected through user or expert feedback in deployment. Existing evaluation protocols focus on overall accuracy and fail to capture how systems adapt after feedback is introduced. We introduce feedback adaptation as a problem setting for RAG systems, which asks how effectively and how quickly corrective feedback propagates to future queries. To make this behavior measurable, we propose two evaluation axes: correction lag, which captures the delay between feedback provision and behavioral change, and post-feedback performance, which measures reliability on semantically related queries after feedback. Using these metrics, we show that training-based approaches exhibit a trade-off between delayed correction and reliable adaptation. We further propose PatchRAG, a minimal inference-time instantiation that incorporates feedback without retraining, demonstrating immediate correction and strong post-feedback generalization under the proposed evaluation. Our results highlight feedback adaptation as a previously overlooked dimension of RAG system behavior in interactive settings.
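
The abstract does not describe how PatchRAG works internally, so the following Python sketch is not the authors' method. It only illustrates the general idea of incorporating feedback at inference time without retraining: corrections are stored alongside the query that triggered them and placed ahead of the retrieved passages for sufficiently similar later queries. The `FeedbackStore` class, the token-overlap (Jaccard) similarity, and the `threshold` parameter are all hypothetical choices made to keep the example self-contained.

```python
import re
from typing import List, Sequence, Set, Tuple


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens; a crude stand-in for an embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class FeedbackStore:
    """Hypothetical in-memory store of corrections, consulted at inference time."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold
        self._patches: List[Tuple[Set[str], str]] = []

    def add(self, query: str, correction: str) -> None:
        """Record a correction; it takes effect on the very next lookup."""
        self._patches.append((_tokens(query), correction))

    def lookup(self, query: str) -> List[str]:
        """Return corrections whose source query overlaps enough with this one."""
        q = _tokens(query)
        hits = []
        for toks, correction in self._patches:
            union = q | toks
            score = len(q & toks) / len(union) if union else 0.0
            if score >= self.threshold:
                hits.append(correction)
        return hits


def build_context(query: str, retrieved: Sequence[str], store: FeedbackStore) -> List[str]:
    """Place matching corrections ahead of the normally retrieved passages."""
    return store.lookup(query) + list(retrieved)


# Usage: the correction applies to a related query without any model update.
store = FeedbackStore(threshold=0.4)
store.add("What is the capital of Australia?", "The capital of Australia is Canberra.")
context = build_context("capital of Australia", ["passage from the index"], store)
```

Because nothing is retrained, the correction is available as soon as it is stored. How well it carries over to related queries depends entirely on the similarity measure, which is the aspect that post-feedback performance is meant to capture.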