Aligning Dense Retrievers with LLM Utility via Distillation
arXiv cs.AI / 4/27/2026
💬 Opinion · Developer Stack & Infrastructure · Models & Research
Key Points
- The paper proposes UAE (Utility-Aligned Embeddings) to improve dense vector retrieval for RAG by aligning embedding similarity with an LLM’s retrieval utility signals.
- UAE trains a bi-encoder using a distribution-matching formulation and a Utility-Modulated InfoNCE objective that derives graded utility from perplexity reduction, avoiding any test-time LLM re-ranking.
- By injecting utility signals directly into the embedding space, UAE aims to overcome precision limitations of pure similarity search while sidestepping the computational cost and noise of LLM-based re-ranking.
- Experiments on the QASPER benchmark show large gains over a strong semantic baseline (BGE-Base), including +30.59% Recall@1, +30.16% MAP, and +17.3% Token F1.
- UAE is reported to be over 180× faster than efficient LLM re-ranking methods while maintaining competitive quality, enabling scalable retrieval for RAG systems.
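The training signal described above can be sketched in miniature. The snippet below is an illustrative assumption, not the paper's implementation: `utility_from_perplexity` treats graded utility as the relative perplexity reduction an LLM sees when a passage is added to its context (the exact normalization is hypothetical), and `utility_modulated_infonce` matches the softmax over query-passage similarities to the normalized utility distribution via cross-entropy, a plain-Python stand-in for the distribution-matching objective.

```python
import math

def utility_from_perplexity(ppl_without: float, ppl_with: float) -> float:
    # Graded utility as relative perplexity reduction when the passage is
    # included in the LLM context. Clamped at 0 so unhelpful passages get
    # zero utility. (Hypothetical formulation; the paper's exact scaling
    # is not given in this summary.)
    return max(0.0, (ppl_without - ppl_with) / ppl_without)

def utility_modulated_infonce(sims, utilities, temperature=0.05):
    # sims: similarity scores between one query and each candidate passage.
    # utilities: graded utility targets in [0, 1] for the same candidates.
    # Cross-entropy between the temperature-scaled softmax over similarities
    # and the utility distribution -- a distribution-matching sketch.
    logits = [s / temperature for s in sims]
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    log_probs = [math.log(e / z) for e in exps]
    total_u = sum(utilities) or 1.0      # avoid division by zero
    targets = [u / total_u for u in utilities]
    return -sum(t * lp for t, lp in zip(targets, log_probs))
```

When the similarity ranking agrees with the utility ranking, the loss is low; when a low-utility passage scores highest, the loss grows, pushing the bi-encoder's embedding space toward the LLM's utility signal without any test-time re-ranking.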

