ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression
arXiv cs.AI / 4/27/2026
Key Points
- ResRank is an end-to-end framework that unifies retrieval and listwise reranking, designed to overcome LLM reranking bottlenecks such as the “lost in the middle” effect and the super-linear inference latency caused by long passages.
- It uses an Encoder-LLM to compress each candidate passage into a single embedding, which is then combined with the query and passed to a Reranker-LLM for listwise ranking.
- A residual connection merges the encoder embeddings with the reranker's contextual hidden states, reducing misalignment between the compressed embedding space and the ranking space.
- ResRank avoids autoregressive generation by scoring passages in one step with cosine similarity: it generates zero tokens and spends only one token per passage. It is trained with a dual-stage, multi-task joint optimization strategy.
- Experiments on TREC Deep Learning and eight BEIR datasets show ResRank is competitive with or better than existing methods while substantially improving the effectiveness/efficiency trade-off.
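
To make the scoring idea in the key points concrete, here is a minimal sketch in plain Python of one-step cosine-similarity ranking over residual-merged passage representations. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say how the residual merge is computed or which representation stands for the query, so the additive merge, the single `query_state` vector, and the function names `residual_merge`, `score_passages`, and `rank_passages` are all hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def residual_merge(encoder_emb: Vector, reranker_hidden: Vector) -> List[float]:
    """ASSUMPTION: the residual connection is modeled as an element-wise sum of
    the compressed passage embedding and the reranker's hidden state at that
    passage's single token position."""
    return [e + h for e, h in zip(encoder_emb, reranker_hidden)]


def score_passages(
    query_state: Vector,
    encoder_embs: Sequence[Vector],
    reranker_hiddens: Sequence[Vector],
) -> List[float]:
    """One score per passage, computed in a single pass with no generated tokens."""
    return [
        cosine(query_state, residual_merge(emb, hidden))
        for emb, hidden in zip(encoder_embs, reranker_hiddens)
    ]


def rank_passages(scores: Sequence[float]) -> List[int]:
    """Passage indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional example with three candidate passages (made-up numbers).
query_state = [1.0, 0.0]
encoder_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reranker_hiddens = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0]]

scores = score_passages(query_state, encoder_embs, reranker_hiddens)
ranking = rank_passages(scores)
```

In this toy input, the second passage's encoder embedding alone is orthogonal to the query, and only the merged representation gives it a non-zero score. That is the kind of correction a residual path can supply, although the actual effect in ResRank depends on details the summary does not give.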




