Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

arXiv cs.AI / 5/4/2026

Key Points

  • The paper introduces Ψ-RAG, a hierarchical Tree-RAG framework designed specifically for cross-document multi-hop question answering rather than only single-document retrieval.
  • Ψ-RAG addresses key scaling issues in existing Tree-RAG approaches by using an iterative “merging and collapse” procedure to build a hierarchical abstract tree that adapts to data distributions.
  • It also adds a multi-granular retrieval agent that reorganizes queries and uses an agent-powered hybrid retriever to better connect and search across documents.
  • Experiments on cross-document multi-hop QA benchmarks show that Ψ-RAG outperforms RAPTOR by 25.9% and HippoRAG 2 by 7.4% in average F1 score, with code released on GitHub.
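The summary does not say how the F1 score is computed. As a minimal sketch, assuming it is the token-overlap F1 commonly reported for question-answering benchmarks, the per-answer metric could look like the following. The function name `token_f1` and the whitespace tokenization are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: lowercased whitespace tokens; real benchmarks usually
    also strip punctuation and articles before comparing.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

An average F1 is then the mean of this value over all questions in a benchmark. Whether the reported 25.9% and 7.4% gains are absolute points or relative improvements is not stated in this summary.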

Abstract

Retrieval-augmented generation (RAG) enhances large language models with external knowledge, and tree-based RAG organizes documents into hierarchical indexes to support queries at multiple granularities. However, existing Tree-RAG methods designed for single-document retrieval face critical challenges in scaling to cross-document multi-hop questions: (1) poor distribution adaptability, where k-means clustering introduces noise due to rigid distribution assumptions; (2) structural isolation, as tree indexes lack explicit cross-document connections; and (3) coarse abstraction, which obscures fine-grained details. To address these limitations, we propose Ψ-RAG, a Tree-RAG framework with two key components. The first is a hierarchical abstract tree index, built through an iterative "merging and collapse" process that adapts to data distributions without a priori assumptions. The second is a multi-granular retrieval agent that interacts with the knowledge base through reorganized queries and an agent-powered hybrid retriever. Ψ-RAG supports diverse tasks, from token-level question answering to document-level summarization. On cross-document multi-hop QA benchmarks, it outperforms RAPTOR by 25.9% and HippoRAG 2 by 7.4% in average F1 score. Code is available at https://github.com/Newiz430/Psi-RAG.
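The abstract does not spell out the "merging and collapse" procedure, so the following is not the authors' algorithm. It is a generic stand-in that illustrates the two ideas the abstract contrasts with k-means: nodes are merged bottom-up whenever their similarity exceeds a threshold, so no cluster count is fixed in advance, and retrieval searches nodes from every level at once. All names (`Node`, `build_tree`, `retrieve`), the bag-of-words similarity, and the concatenated parent text standing in for a generated abstract are assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    level: int
    children: list = field(default_factory=list)

    @property
    def vector(self) -> Counter:
        # Toy bag-of-words embedding; a real system would use a dense encoder.
        return Counter(self.text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_tree(chunks: list[str], threshold: float = 0.3) -> list[Node]:
    """Merge the most similar pair of roots until none exceeds the threshold.

    Returns every node created, leaves and parents alike.
    """
    roots = [Node(text=c, level=0) for c in chunks]
    all_nodes = list(roots)
    while len(roots) > 1:
        score, i, j = max(
            (cosine(a.vector, b.vector), i, j)
            for i, a in enumerate(roots)
            for j, b in enumerate(roots)
            if i < j
        )
        if score < threshold:
            break  # remaining roots are too dissimilar to merge
        a, b = roots[i], roots[j]
        # Placeholder for an LLM-written abstract of the two children.
        parent = Node(a.text + " " + b.text, max(a.level, b.level) + 1, [a, b])
        roots = [n for k, n in enumerate(roots) if k not in (i, j)] + [parent]
        all_nodes.append(parent)
    return all_nodes


def retrieve(query: str, nodes: list[Node], top_k: int = 2) -> list[Node]:
    """Rank nodes from all levels together by similarity to the query."""
    q = Counter(query.lower().split())
    return sorted(nodes, key=lambda n: cosine(q, n.vector), reverse=True)[:top_k]


nodes = build_tree([
    "alice founded acme in berlin",
    "acme in berlin builds robots",
    "bananas grow in tropical climates",
])
top = retrieve("who founded acme", nodes, top_k=1)[0]
```

In this toy run the two Acme chunks merge into one parent while the unrelated chunk stays a separate root, and a specific question is answered from a leaf rather than from the coarser parent. The cross-document linking, query reorganization, and agent-driven hybrid retrieval described in the abstract are not modeled here.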