Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering

arXiv cs.CL / April 28, 2026


Key Points

  • Standard RAG chunking often introduces redundant chunks that inflate storage costs and slow down retrieval operations.
  • The study evaluates lightweight chunk filtering approaches—semantic, topic-based, and named-entity-based—to shrink the indexed corpus while preserving retrieval quality.
  • Experiments across multiple corpora use precision, recall, and intersection-over-union (IoU) token-based evaluation to measure retrieval performance.
  • Results show named-entity-based filtering can cut vector index size by roughly 25% to 36% while keeping retrieval quality close to a baseline.
  • The findings indicate that redundancy from chunking can be reduced effectively, improving the efficiency of retrieval components in RAG pipelines.
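The token-based evaluation mentioned above can be made concrete with a small sketch. The paper's exact token alignment scheme is not specified here, so this minimal version (an assumption) treats the retrieved text and the gold relevant text as sets of tokens and computes precision, recall, and intersection-over-union from their overlap:

```python
def token_metrics(retrieved, relevant):
    """Token-level precision, recall, and IoU between a retrieved span
    and the gold relevant span, each given as a list of tokens.
    Set-based overlap is an assumption; the paper may align tokens
    positionally instead."""
    r = set(retrieved)
    g = set(relevant)
    inter = len(r & g)
    union = len(r | g)
    precision = inter / len(r) if r else 0.0
    recall = inter / len(g) if g else 0.0
    iou = inter / union if union else 0.0
    return precision, recall, iou

# Two of three retrieved tokens are relevant, two of three relevant
# tokens are retrieved: precision = recall = 2/3, IoU = 2/4 = 0.5.
p, r, iou = token_metrics(["the", "cat", "sat"], ["cat", "sat", "down"])
```

Because IoU penalizes both missed and spurious tokens symmetrically, it is a stricter single-number summary than either precision or recall alone.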

Abstract

Standard Retrieval-Augmented Generation (RAG) chunking methods often create excessive redundancy, increasing storage costs and slowing retrieval. This study explores lightweight chunk filtering strategies (semantic, topic-based, and named-entity-based) to shrink the indexed corpus while preserving retrieval quality. Experiments are conducted on multiple corpora, and retrieval performance is evaluated using a token-based framework built on precision, recall, and intersection-over-union metrics. Results indicate that entity-based filtering can reduce vector index size by approximately 25% to 36% while maintaining retrieval quality close to the baseline. These findings suggest that redundancy introduced during chunking can be effectively reduced through lightweight filtering, improving the efficiency of retrieval-oriented components in RAG pipelines.
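One way to picture the entity-based filtering described in the abstract is as a greedy pass that drops any chunk whose named entities are already covered by chunks kept so far. The sketch below is an illustration, not the paper's method: a real pipeline would use an actual NER model, whereas here a toy capitalized-word matcher stands in for entity extraction.

```python
import re

def entities(chunk):
    """Toy stand-in for NER: treat capitalized words as 'entities'.
    A real pipeline would run a proper NER model here."""
    return frozenset(re.findall(r"\b[A-Z][a-zA-Z]+\b", chunk))

def entity_filter(chunks):
    """Greedy redundancy filter: keep a chunk only if it mentions at
    least one entity not already covered by previously kept chunks
    (chunks with no detected entities are kept conservatively)."""
    kept, seen = [], set()
    for chunk in chunks:
        ents = entities(chunk)
        if not ents or ents - seen:
            kept.append(chunk)
            seen |= ents
    return kept

chunks = [
    "Alice met Bob in Paris.",
    "Bob and Alice talked.",   # entities already covered -> dropped
    "Carol joined later.",     # new entity -> kept
]
filtered = entity_filter(chunks)  # keeps the first and third chunk
```

Dropping the redundant middle chunk shrinks the index without losing entity coverage, which mirrors the paper's observation that filtering can cut index size substantially while retrieval quality stays near the baseline.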