Principled and Scalable Diversity-Aware Retrieval via Cardinality-Constrained Binary Quadratic Programming

arXiv cs.CL / 4/6/2026

Key Points

  • The paper addresses diversity-aware passage retrieval for RAG, noting that prior approaches struggle with theoretical guarantees and scalability as the retrieval set size k grows.
  • It formulates diversity retrieval as cardinality-constrained binary quadratic programming (CCBQP), introducing an interpretable parameter that explicitly trades off relevance versus semantic diversity.
  • The authors develop a non-convex tight continuous relaxation and a Frank–Wolfe-based algorithm, including landscape analysis and convergence guarantees.
  • Experiments show the proposed method improves over baselines across the relevance–diversity Pareto frontier and delivers substantial speedups.
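
As a rough illustration of the CCBQP formulation mentioned in the key points, one generic way to write such a problem is shown below. The symbols are assumptions made for this sketch and are not taken from the paper: x is a binary selection vector over n candidate passages, r holds relevance scores for the query, S is a pairwise similarity matrix, and λ in [0, 1] is the trade-off parameter. The paper's exact objective, scaling, and sign conventions may differ.

```latex
\max_{x \in \{0,1\}^n} \; \lambda\, r^\top x \;-\; (1-\lambda)\, x^\top S x
\quad \text{subject to} \quad \mathbf{1}^\top x = k
```

Under this assumed form, λ = 1 reduces to plain top-k retrieval by relevance, while smaller values of λ increasingly penalize selecting passages that are similar to one another.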

Abstract

Diversity-aware retrieval is essential for Retrieval-Augmented Generation (RAG), yet existing methods lack theoretical guarantees and face scalability issues as the number of retrieved passages k increases. We propose a principled formulation of diversity retrieval as a cardinality-constrained binary quadratic program (CCBQP), which explicitly balances relevance and semantic diversity through an interpretable trade-off parameter. Inspired by recent advances in combinatorial optimization, we develop a tight non-convex continuous relaxation and a Frank–Wolfe-based algorithm, supported by landscape analysis and convergence guarantees. Extensive experiments demonstrate that our method consistently dominates baselines on the relevance–diversity Pareto frontier while achieving a significant speedup.
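
To give a feel for how a Frank–Wolfe method can be applied to a cardinality-constrained selection problem, here is a minimal Python sketch. It is a generic textbook-style Frank–Wolfe loop, not the authors' algorithm, and it does not reproduce their relaxation or guarantees. Everything in it is an assumption made for illustration: the function name `frank_wolfe_select`, the objective λ·rᵀx − (1−λ)·xᵀSx, the choice to zero the diagonal of S (which only shifts the objective by a constant on binary vectors when S has a unit diagonal), and the standard 2/(t+2) step size. The binary constraint is relaxed to 0 ≤ x ≤ 1 with the entries summing to k, for which the linear maximization step simply picks the k largest gradient entries.

```python
import numpy as np


def frank_wolfe_select(r, S, k, lam=0.5, iters=100):
    """Hypothetical sketch: pick k items trading off relevance and diversity.

    r   : (n,) relevance scores (assumed input)
    S   : (n, n) symmetric similarity matrix (assumed input)
    lam : trade-off weight in [0, 1]; 1.0 means pure relevance
    """
    n = r.shape[0]
    S0 = np.array(S, dtype=float)   # copy so the caller's matrix is untouched
    np.fill_diagonal(S0, 0.0)       # assumption: drop self-similarity terms

    # Feasible starting point: every entry in [0, 1], entries sum to k.
    x = np.full(n, k / n)

    for t in range(iters):
        # Gradient of lam * r.x - (1 - lam) * x.S0.x for symmetric S0.
        grad = lam * r - 2.0 * (1.0 - lam) * (S0 @ x)

        # Linear maximization over {0 <= x <= 1, sum(x) = k}:
        # the best vertex puts 1 on the k largest gradient entries.
        v = np.zeros(n)
        v[np.argpartition(-grad, k - 1)[:k]] = 1.0

        # Standard diminishing Frank-Wolfe step size.
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * v

    # Round the relaxed solution: keep the k largest coordinates.
    return np.sort(np.argpartition(-x, k - 1)[:k])
```

A call such as `frank_wolfe_select(r, S, k=5, lam=0.7)` returns the sorted indices of the selected passages. The per-iteration cost is dominated by the matrix-vector product `S0 @ x`, which is one reason first-order methods of this kind tend to scale more gracefully with k than greedy pairwise re-ranking; the speedups reported in the paper refer to the authors' own method, not to this sketch.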
