PAR$^2$-RAG: Planned Active Retrieval and Reasoning for Multi-Hop Question Answering

arXiv cs.AI / 4/1/2026


Key Points

  • The paper introduces PAR$^2$-RAG, a two-stage RAG framework designed to improve multi-hop question answering by separating “coverage” from “commitment.”
  • It uses breadth-first anchoring to construct a high-recall evidence frontier, then performs depth-first iterative refinement with evidence sufficiency control to reduce error amplification.
  • PAR$^2$-RAG is evaluated on four multi-hop QA benchmarks and consistently beats prior state-of-the-art baselines.
  • Compared with the IRCoT baseline, PAR$^2$-RAG achieves up to 23.5% higher accuracy and up to a 10.5% improvement in retrieval NDCG.
  • The work targets key failure modes of prior approaches: getting stuck on early low-recall retrieval trajectories and producing non-adaptive static query plans.
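The two-stage loop described above can be sketched in pseudocode-style Python. This is a hypothetical illustration, not the paper's implementation: the helper names (`retrieve`, `decompose`, `refine_query`, `sufficient`, `answer`) and the iteration bound are assumptions, and the paper's actual sufficiency control is likely more sophisticated.

```python
# Hypothetical sketch of PAR$^2$-RAG's two stages. All helper functions
# (retrieve, decompose, refine_query, sufficient, answer) are illustrative
# stand-ins, passed in as callables; the paper's algorithm may differ.

def par2_rag(question, retrieve, decompose, refine_query, sufficient, answer,
             max_refinements=5):
    # Stage 1: breadth-first anchoring -- issue all planned sub-queries
    # up front to build a high-recall evidence frontier ("coverage")
    # before committing to any single reasoning path.
    frontier = []
    for sub_query in decompose(question):
        frontier.extend(retrieve(sub_query))

    # Stage 2: depth-first iterative refinement ("commitment") -- follow
    # up on remaining gaps, stopping once the evidence sufficiency check
    # passes, so errors from one step are not amplified indefinitely.
    evidence = list(frontier)
    for _ in range(max_refinements):
        if sufficient(question, evidence):
            break
        evidence.extend(retrieve(refine_query(question, evidence)))
    return answer(question, evidence)
```

Separating the two stages means a poor early retrieval does not lock in the trajectory: the frontier is built broadly first, and refinement only deepens where the sufficiency check says evidence is still missing.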

Abstract

Large language models (LLMs) remain brittle on multi-hop question answering (MHQA), where answering requires combining evidence across documents through retrieval and reasoning. Iterative retrieval systems can fail by locking onto an early low-recall trajectory and amplifying downstream errors, while planning-only approaches may produce static query sets that cannot adapt when intermediate evidence changes. We propose Planned Active Retrieval and Reasoning RAG (PAR$^2$-RAG), a two-stage framework that separates "coverage" from "commitment." PAR$^2$-RAG first performs breadth-first anchoring to build a high-recall evidence frontier, then applies depth-first refinement with evidence sufficiency control in an iterative loop. Across four MHQA benchmarks, PAR$^2$-RAG consistently outperforms existing state-of-the-art baselines; compared with IRCoT, it achieves up to 23.5% higher accuracy, with retrieval gains of up to 10.5% in NDCG.