Mitigating Hallucination on Hallucination in RAG via Ensemble Voting

arXiv cs.CL / 3/31/2026


Key Points

  • Retrieval-Augmented Generation (RAG) can still produce “hallucination on hallucination” when flawed retrieval misleads the LLM, compounding errors during response generation.
  • The paper introduces VOTE-RAG, a training-free, two-stage ensemble-voting framework that first aggregates documents via retrieval voting and then selects the final answer via response voting.
  • Retrieval voting uses multiple parallel query-generating agents to diversify queries and pool retrieved documents, aiming to reduce the impact of any single bad retrieval.
  • Response voting has multiple agents independently generate answers from the aggregated documents and uses majority vote to improve robustness and reliability.
  • Experiments on six benchmark datasets indicate VOTE-RAG matches or outperforms more complex methods while remaining simpler, fully parallelizable, and free of the “problem drift” risk.

Abstract

Retrieval-Augmented Generation (RAG) aims to reduce hallucinations in Large Language Models (LLMs) by integrating external knowledge. However, RAG introduces a critical challenge: "hallucination on hallucination," where flawed retrieval results mislead the generation model, leading to compounded hallucinations. To address this issue, we propose VOTE-RAG, a novel, training-free framework with a two-stage structure and efficient, parallelizable voting mechanisms. VOTE-RAG includes: (1) Retrieval Voting, where multiple agents generate diverse queries in parallel and aggregate all retrieved documents; (2) Response Voting, where multiple agents independently generate answers based on the aggregated documents, with the final output determined by majority vote. We conduct comparative experiments on six benchmark datasets. Our results show that VOTE-RAG achieves performance comparable to or surpassing more complex frameworks. Additionally, VOTE-RAG features a simpler architecture, is fully parallelizable, and avoids the "problem drift" risk. Our work demonstrates that simple, reliable ensemble voting is a superior and more efficient method for mitigating RAG hallucinations.
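
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the agent and retriever callables, the function names (`retrieval_voting`, `response_voting`, `vote_rag`), the de-duplication of pooled documents, and the lowercase/whitespace normalization applied before counting votes are all assumptions made for illustration. In practice each agent would wrap an LLM call and the retriever would wrap a search index.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions, not the paper's API).
QueryAgent = Callable[[str], str]               # question -> search query
Retriever = Callable[[str], List[str]]          # search query -> documents
AnswerAgent = Callable[[str, List[str]], str]   # (question, documents) -> answer


def retrieval_voting(question: str,
                     query_agents: Sequence[QueryAgent],
                     retrieve: Retriever) -> List[str]:
    """Stage 1: agents write queries in parallel; retrieved documents are pooled."""
    def run(agent: QueryAgent) -> List[str]:
        return retrieve(agent(question))

    with ThreadPoolExecutor() as pool:
        per_agent_docs = list(pool.map(run, query_agents))  # keeps agent order

    # Assumption: drop exact duplicates while preserving first-seen order.
    pooled: List[str] = []
    seen = set()
    for docs in per_agent_docs:
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                pooled.append(doc)
    return pooled


def response_voting(question: str,
                    docs: List[str],
                    answer_agents: Sequence[AnswerAgent]) -> str:
    """Stage 2: agents answer independently; the most common answer wins."""
    def run(agent: AnswerAgent) -> str:
        return agent(question, docs)

    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(run, answer_agents))

    # Assumption: answers are compared after trimming and lowercasing.
    normalized = [a.strip().lower() for a in answers]
    winner, _count = Counter(normalized).most_common(1)[0]
    # Return the first raw answer that matches the winning normalized form.
    return answers[normalized.index(winner)]


def vote_rag(question: str,
             query_agents: Sequence[QueryAgent],
             retrieve: Retriever,
             answer_agents: Sequence[AnswerAgent]) -> str:
    """Chain the two stages: pooled retrieval, then majority-vote answering."""
    docs = retrieval_voting(question, query_agents, retrieve)
    return response_voting(question, docs, answer_agents)
```

Because neither stage feeds one agent's output into another agent, every call within a stage can run concurrently, which is the parallelism the abstract emphasizes. Exact-match voting suits short factual answers; free-form answers would need some other way of grouping equivalent responses, which this sketch does not attempt.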