AtomicRAG: Atom-Entity Graphs for Retrieval-Augmented Generation

arXiv cs.AI / 4/25/2026

Key Points

The paper argues that existing GraphRAG systems often treat text chunks as fixed knowledge units, which reduces flexibility across different retrieval scenarios.
It proposes AtomicRAG, representing knowledge as fine-grained “knowledge atoms” (self-contained factual units) rather than coarse text chunks.
The approach builds Atom-Entity Graphs in which edges between entities record only whether a relationship exists, reducing reliance on triple-based entity linking, which is sensitive to relation-extraction errors.
It combines personalized PageRank with relevance-based filtering to keep entity connections accurate and make reasoning paths more reliable.
Theoretical analysis, together with experiments on five public benchmarks, shows that AtomicRAG outperforms strong RAG baselines in retrieval accuracy and reasoning robustness.
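
To make the atom and edge ideas in the key points concrete, here is a minimal Python sketch of what an atom-entity index might look like. It is not the paper's implementation. The class name `AtomEntityGraph`, its fields, and the rule that two entities are linked whenever they are mentioned in the same atom are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AtomEntityGraph:
    """Toy index: atoms are short factual strings, entities are plain names."""
    atoms: list = field(default_factory=list)  # atom id -> atom text
    entity_atoms: dict = field(default_factory=lambda: defaultdict(set))  # entity -> atom ids
    neighbors: dict = field(default_factory=lambda: defaultdict(set))  # entity -> linked entities

    def add_atom(self, text, entities):
        """Store one self-contained fact and link the entities it mentions."""
        atom_id = len(self.atoms)
        self.atoms.append(text)
        mentioned = set(entities)
        for entity in mentioned:
            self.entity_atoms[entity].add(atom_id)
            # Unlabeled edge: we only record THAT two entities are related,
            # not a relation type. Assumption: co-mention in one atom is
            # enough to create the edge.
            self.neighbors[entity].update(mentioned - {entity})
        return atom_id

    def atoms_for(self, entity):
        """Return the atom texts that mention the given entity."""
        return [self.atoms[i] for i in sorted(self.entity_atoms.get(entity, ()))]


graph = AtomEntityGraph()
graph.add_atom("Marie Curie was born in Warsaw.", ["Marie Curie", "Warsaw"])
graph.add_atom(
    "Marie Curie won the Nobel Prize in Physics in 1903.",
    ["Marie Curie", "Nobel Prize in Physics"],
)
print(graph.atoms_for("Warsaw"))
print(sorted(graph.neighbors["Marie Curie"]))
```

Because each atom is stored on its own, a query about Warsaw retrieves only the birthplace fact, whereas a chunk-based index would return both sentences together if they had been placed in the same chunk.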

Abstract

Recent GraphRAG methods integrate graph structures into text indexing and retrieval, using knowledge graph triples to connect text chunks, thereby improving retrieval coverage and precision. However, we observe that treating text chunks as the basic unit of knowledge representation rigidly groups multiple atomic facts together, limiting the flexibility and adaptability needed to support diverse retrieval scenarios. Additionally, triple-based entity linking is sensitive to relation-extraction errors, which can lead to missing or incorrect reasoning paths and ultimately hurt retrieval accuracy. To address these issues, we propose the Atom-Entity Graph, a more precise and reliable architecture for knowledge representation and indexing. In our approach, knowledge is stored as knowledge atoms, namely individual, self-contained units of factual information, rather than coarse-grained text chunks. This allows knowledge elements to be flexibly reassembled without mutual interference, thereby enabling seamless alignment with diverse query perspectives. Edges between entities simply indicate whether a relationship exists. By combining personalized PageRank with relevance-based filtering, we maintain accurate entity connections and improve the reliability of reasoning. Theoretical analysis and experiments on five public benchmarks show that the proposed AtomicRAG algorithm outperforms strong RAG baselines in retrieval accuracy and reasoning robustness. Code: https://github.com/7HHHHH/AtomicRAG.
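
The abstract does not spell out how personalized PageRank and relevance-based filtering are combined. The following Python sketch shows one plausible reading: run personalized PageRank from the query's seed entities over the unlabeled entity graph, score each atom by the PageRank mass of the entities it mentions, and drop atoms whose query relevance is below a threshold. The function names, the `atom_relevance` scores (imagined here as precomputed query-to-atom similarities), the threshold, and the filter-then-rank order are all assumptions, not details taken from the paper.

```python
def personalized_pagerank(neighbors, seeds, damping=0.85, iterations=50):
    """Power iteration on an undirected, unweighted entity graph.

    Assumes every entity (including every neighbor and seed) is a key of
    `neighbors`, and that `seeds` is non-empty.
    """
    nodes = list(neighbors)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = neighbors[n]
            if not out:
                # Dangling node: send its mass back to the seed entities.
                for s in seeds:
                    nxt[s] += damping * scores[n] / len(seeds)
                continue
            share = damping * scores[n] / len(out)
            for m in out:
                nxt[m] += share
        scores = nxt
    return scores


def retrieve(neighbors, entity_atoms, atom_relevance, seeds,
             min_relevance=0.3, top_k=2):
    """Rank atoms by the PageRank mass of their entities, after filtering."""
    ppr = personalized_pagerank(neighbors, seeds)
    atom_scores = {}
    for entity, atom_ids in entity_atoms.items():
        for atom_id in atom_ids:
            atom_scores[atom_id] = atom_scores.get(atom_id, 0.0) + ppr.get(entity, 0.0)
    # Relevance-based filtering (hypothetical rule): discard atoms that the
    # graph reaches but that score poorly against the query itself.
    kept = [a for a in atom_scores if atom_relevance.get(a, 0.0) >= min_relevance]
    kept.sort(key=lambda a: (-atom_scores[a], a))
    return kept[:top_k]


# Toy data: entity graph A - B - C plus an isolated entity D.
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
entity_atoms = {"A": {0}, "B": {0, 1}, "C": {1, 2}, "D": {3}}
atom_relevance = {0: 0.9, 1: 0.6, 2: 0.1, 3: 0.8}  # made-up similarity scores

scores = personalized_pagerank(neighbors, {"A"})
result = retrieve(neighbors, entity_atoms, atom_relevance, {"A"})
print(result)
```

In this toy run, atom 2 is reachable through entity C but is removed by the relevance filter, which illustrates how filtering can prune reasoning paths that the graph structure alone would follow.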