Beyond Semantic Search: Towards Referential Anchoring in Composed Image Retrieval
arXiv cs.CV / 4/8/2026
Key Points
- The paper argues that composed image retrieval (CIR) often over-optimizes for semantic similarity and therefore fails to consistently retrieve the exact user-specified instance across different contexts.
- It introduces Object-Anchored Composed Image Retrieval (OACIR), a stricter fine-grained retrieval task focused on instance-level consistency rather than broad semantics.
- To support OACIR research, the authors build OACIRR, a new large-scale, multi-domain benchmark with 160K+ query quadruples, four candidate galleries, and hard-negative instance distractors.
- The benchmark extends each compositional query with a bounding box that anchors the target object in the reference image, enabling precise instance preservation.
- For the task, the authors propose AdaFocal, whose context-aware attention modulator emphasizes the anchored instance region while balancing it against the surrounding compositional context; they report strong improvements over existing models.
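The core idea behind anchoring, as summarized above, is to give the boxed instance extra weight while still attending to the rest of the image. The sketch below is a minimal, hypothetical illustration of that region-weighted pooling over patch features; it is not the paper's AdaFocal architecture, and the function name, `alpha` split, and grid layout are assumptions for the example.

```python
import numpy as np

def anchored_pooling(patch_feats, box, grid, alpha=0.7):
    """Pool patch features with extra weight inside an anchored box.

    patch_feats: (H*W, D) array of image patch embeddings, row-major grid.
    box: (x0, y0, x1, y1) in grid cell coordinates (half-open).
    grid: (H, W) shape of the patch grid.
    alpha: fraction of the total attention mass placed on the box region
           (hypothetical knob; not a parameter from the paper).
    """
    H, W = grid
    x0, y0, x1, y1 = box
    mask = np.zeros((H, W), dtype=float)
    mask[y0:y1, x0:x1] = 1.0
    mask = mask.ravel()
    inside = mask.sum()
    outside = mask.size - inside
    # Split the attention mass: alpha over anchored cells,
    # (1 - alpha) spread over the surrounding context cells.
    weights = np.where(mask > 0, alpha / inside, (1 - alpha) / outside)
    return weights @ patch_feats  # (D,) anchored image embedding
```

With `alpha=0.5` on a 2x2 grid and a single anchored cell, that cell contributes half of the pooled embedding and the three context cells share the rest, so the instance dominates without the context being discarded.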