Association Is Not Similarity: Learning Corpus-Specific Associations for Multi-Hop Retrieval
arXiv cs.CL / 4/24/2026
Key Points
- The paper proposes Association-Augmented Retrieval (AAR), which reranks dense retrieval candidates using learned, corpus-specific associative relationships rather than relying solely on embedding similarity.
- AAR uses a small 4.2M-parameter MLP trained with contrastive learning on co-occurrence annotations to score bidirectional associations between passages during inference.
- On HotpotQA, AAR raises passage Recall@5 from 0.831 to 0.916 (+8.6 points) without evaluation-set tuning, with the largest gains on hard questions (+28.5 points); it also improves MuSiQue by +10.1 points in the transductive setting.
- Experiments indicate the approach is not broadly transferable: an inductive variant trained on training-split associations shows no significant improvement on unseen validation associations, and ablations confirm that using true association pairs (not just semantic similarity) is critical.
- The method is lightweight and practical, adding about 3.7 ms per query, training in under two minutes on a single GPU, and requiring no LLM-based indexing, while retrieval improvements translate to +6.4 exact match in downstream QA.
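The reranking idea above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tiny MLP architecture (concatenated passage embeddings into one hidden layer), the symmetric averaging of the two directions, and the additive fusion weight `alpha` are all assumptions chosen for clarity.

```python
import numpy as np

def mlp_association_score(h1, h2, W1, b1, W2, b2):
    """Score a directed association between two passage embeddings with a
    small MLP. Concatenation input and a single ReLU layer are assumptions."""
    x = np.concatenate([h1, h2])
    hidden = np.maximum(0.0, W1 @ x + b1)  # ReLU hidden layer
    return float(W2 @ hidden + b2)         # scalar association logit

def rerank(query_scores, passage_embs, params, alpha=0.5):
    """Rerank dense-retrieval candidates: each candidate's similarity score
    is boosted by its strongest (bidirectionally averaged) learned
    association with any other candidate in the pool."""
    W1, b1, W2, b2 = params
    n = len(passage_embs)
    combined = []
    for i in range(n):
        assoc = max(
            (0.5 * (mlp_association_score(passage_embs[i], passage_embs[j],
                                          W1, b1, W2, b2)
                    + mlp_association_score(passage_embs[j], passage_embs[i],
                                            W1, b1, W2, b2))
             for j in range(n) if j != i),
            default=0.0,
        )
        combined.append(query_scores[i] + alpha * assoc)
    return np.argsort(combined)[::-1]  # candidate indices, best first
```

With random weights the ordering is meaningless, but the sketch shows why the overhead is small: scoring is a handful of matrix-vector products per candidate pair, consistent with the millisecond-level latency the summary reports.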