CHOP: Chunkwise Context-Preserving Framework for RAG on Multi Documents
arXiv cs.CL / 4/20/2026
Key Points
- The paper introduces CHOP, a chunking-and-reconstruction framework for RAG that aims to prevent retrieval accuracy from degrading when similar documents coexist in a vector database.
- CHOP uses an LLM-driven iterative process to assess chunk relevance and to rebuild document content by linking chunks to specific topics or query types.
- It proposes two core modules: CNM-Extractor, which creates compact per-chunk signatures (categories, key nouns, and model names), and a Continuity Decision Module, which maintains contextual coherence by deciding whether consecutive chunks belong to the same document flow.
- By prefixing each chunk with context-aware metadata, CHOP reduces semantic conflicts between chunks from similar documents and helps the retriever tell them apart, which the paper reports as improved ranking quality on its benchmarks.
- The reported experiments reach a Top-1 Hit Rate of 90.77%, meaning the correct result was ranked first for roughly nine in ten queries, which the authors present as evidence of fewer confusion-driven retrieval errors.
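
To make the key points concrete, here is a minimal Python sketch of the three ideas: a per-chunk signature, a metadata prefix, and a Top-1 Hit Rate calculation. It is an illustration under stated assumptions, not the paper's implementation. The names `ChunkSignature`, `prefix_chunk`, `same_flow`, and `top1_hit_rate` are hypothetical, the prefix layout is invented for the example, and the continuity check is a simple rule standing in for the LLM-driven decision the summary describes.

```python
from dataclasses import dataclass


@dataclass
class ChunkSignature:
    """Hypothetical per-chunk signature: category, key nouns, model names."""
    category: str
    key_nouns: list[str]
    model_names: list[str]


def prefix_chunk(text: str, sig: ChunkSignature) -> str:
    """Prepend the signature as a metadata header (assumed layout)."""
    header = (
        f"[category: {sig.category} | "
        f"key nouns: {', '.join(sig.key_nouns)} | "
        f"models: {', '.join(sig.model_names)}]"
    )
    return f"{header}\n{text}"


def same_flow(prev: ChunkSignature, curr: ChunkSignature) -> bool:
    """Rule-based stand-in for the continuity decision (assumption):
    two consecutive chunks continue the same flow when they share a
    category and at least one model name."""
    shared_models = set(prev.model_names) & set(curr.model_names)
    return prev.category == curr.category and bool(shared_models)


def top1_hit_rate(ranked_ids: list[list[str]], gold_ids: list[str]) -> float:
    """Fraction of queries whose top-ranked result is the gold one."""
    hits = sum(
        1 for ranked, gold in zip(ranked_ids, gold_ids)
        if ranked and ranked[0] == gold
    )
    return hits / len(gold_ids)


if __name__ == "__main__":
    sig = ChunkSignature("manual", ["battery", "charger"], ["X100"])
    print(prefix_chunk("Charge the battery for two hours.", sig))
    print(top1_hit_rate([["d1", "d2"], ["d3", "d1"]], ["d1", "d1"]))
```

In a real pipeline the prefixed text, rather than the raw chunk, would be embedded and indexed, so that chunks from near-identical documents differ in the fields the signature captures.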



