Document Optimization for Black-Box Retrieval via Reinforcement Learning
arXiv cs.CL / 4/8/2026
Key Points
- The paper reframes “document expansion” as a “document optimization” problem to improve retrieval quality without increasing query-time computation by learning document transformations offline.
- It fine-tunes a language model or vision-language model using GRPO, leveraging only black-box access to a target retriever’s ranking outputs as reward signals.
- The method is designed to work across diverse retriever types, including single-vector, multi-vector, and lexical retrievers, rather than being tied to one architecture.
- Experiments on code retrieval and visual document retrieval (VDR) show consistent retrieval gains, including cases where smaller retrievers improved enough to outperform larger ones.
- When retriever weights are available, the learned document optimization can match or complement retriever fine-tuning; in several settings, combining both approaches yields the best results.
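Since the paper only specifies that the black-box retriever's ranking outputs serve as reward signals, one plausible reward shape is the reciprocal rank of the optimized document after it replaces the original. The sketch below illustrates this idea; the toy word-overlap retriever, the function names, and the reciprocal-rank reward are all assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch: rank-based reward for offline document optimization
# against a black-box retriever. Only the retriever's ranking is observed.

def toy_retriever(query, corpus):
    """Stand-in black-box retriever: ranks docs by word overlap with the query."""
    scores = {doc_id: len(set(query.split()) & set(text.split()))
              for doc_id, text in corpus.items()}
    return sorted(scores, key=scores.get, reverse=True)  # doc ids, best first

def rank_reward(query, target_id, rewritten_doc, corpus):
    """Reward a candidate rewrite by the target doc's reciprocal rank."""
    edited = dict(corpus)
    edited[target_id] = rewritten_doc  # swap in the policy's rewrite
    ranking = toy_retriever(query, edited)
    return 1.0 / (ranking.index(target_id) + 1)

corpus = {"d1": "sorting lists in python", "d2": "http client example"}
query = "python sort list"
# Reward for the original document vs. a (hypothetical) optimized rewrite:
before = rank_reward(query, "d2", corpus["d2"], corpus)                       # 0.5
after = rank_reward(query, "d2", "python sort list http client example", corpus)  # 1.0
```

In a GRPO setup, the policy model would sample several rewrites per document, score each with a reward like `rank_reward`, and update toward rewrites that rank higher, without ever touching the retriever's weights.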