G-MIXER: Geodesic Mixup-based Implicit Semantic Expansion and Explicit Semantic Re-ranking for Zero-Shot Composed Image Retrieval
arXiv cs.CV / 4/17/2026
Key Points
- The paper introduces G-MIXER, a training-free method for zero-shot Composed Image Retrieval (ZS-CIR), a task in which retrieval must balance the explicit semantics of the query with the implicit semantics that arise from image-text composition.
- Unlike prior approaches that mostly depend on MLLM-generated textual descriptions, G-MIXER uses geodesic mixup across multiple mixup ratios to expand composed query features and produce a more diverse candidate set.
- G-MIXER then re-ranks the generated candidates using explicit semantics obtained from Multimodal Large Language Models (MLLMs), improving both diversity and retrieval accuracy.
- The method achieves state-of-the-art results on multiple ZS-CIR benchmarks without additional training, and the authors provide code via a GitHub repository.
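
The two stages described above, feature expansion by geodesic mixup and re-ranking by explicit semantics, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "geodesic mixup" means spherical interpolation between L2-normalized image and text features, that candidates from all mixup ratios are pooled by simple union, and that the MLLM's explicit semantics are available as a single embedding. The function names (`geodesic_mixup`, `expand_candidates`, `rerank`) and parameters (`ratios`, `top_k`, `explicit_feat`) are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def geodesic_mixup(a, b, lam):
    """Interpolate along the great circle between a and b.

    lam=0 returns a, lam=1 returns b; the result stays on the unit sphere.
    """
    a, b = normalize(a), normalize(b)
    cos_theta = max(-1.0, min(1.0, dot(a, b)))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-8:
        # Parallel or antipodal vectors: the geodesic is degenerate or
        # ambiguous, so this sketch simply falls back to a.
        return a
    wa = math.sin((1.0 - lam) * theta) / sin_theta
    wb = math.sin(lam * theta) / sin_theta
    return [wa * x + wb * y for x, y in zip(a, b)]


def expand_candidates(img_feat, txt_feat, gallery, ratios, top_k):
    """Retrieve top_k gallery indices per mixup ratio and pool them (assumed union)."""
    gallery = [normalize(g) for g in gallery]
    pool = []
    for lam in ratios:
        query = geodesic_mixup(img_feat, txt_feat, lam)
        ranked = sorted(range(len(gallery)), key=lambda i: -dot(query, gallery[i]))
        for i in ranked[:top_k]:
            if i not in pool:
                pool.append(i)
    return pool


def rerank(pool, gallery, explicit_feat):
    """Order pooled candidates by cosine similarity to an explicit-semantics embedding.

    explicit_feat stands in for an embedding of MLLM output (an assumption).
    """
    e = normalize(explicit_feat)
    return sorted(pool, key=lambda i: -dot(e, normalize(gallery[i])))


# Toy 2-D features, purely illustrative.
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
pool = expand_candidates([1.0, 0.0], [0.0, 1.0], gallery, ratios=[0.0, 0.5, 1.0], top_k=1)
ranking = rerank(pool, gallery, explicit_feat=[0.0, 1.0])
```

In this toy setup each mixup ratio surfaces a different gallery item, which is the diversity effect the summary attributes to using multiple ratios; the re-ranking step then orders that pool by the explicit signal alone.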