GESS: Multi-cue Guided Local Feature Learning via Geometric and Semantic Synergy
arXiv cs.CV / 4/8/2026
Key Points
- The paper introduces GESS, a multi-cue guided framework that improves local feature detection and description in computer vision by jointly leveraging semantic and geometric cues.
- It uses two lightweight prediction heads to reduce optimization interference and improve keypoint reliability: one couples semantics and surface normals through a shared 3D vector field, and the other estimates depth stability through geometric consistency.
- A Semantic-Depth Aware Keypoint (SDAK) mechanism reweights keypoint responses using semantic reliability and depth stability to suppress spurious features in unreliable regions.
- For descriptors, it proposes a Unified Triple-Cue Fusion (UTCF) module with a semantic-scheduled gating strategy to adaptively inject multi-attribute information and enhance discriminability.
- Experiments across four benchmarks report improved robustness and descriptor quality, and the authors indicate code and pretrained models will be released on GitHub.
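
The summary above does not give the paper's formulas, so the following is only a minimal sketch of the two ideas behind SDAK and UTCF, not the authors' implementation. It assumes that keypoint reweighting is a simple product of the raw response with a semantic-reliability weight and a depth-stability weight (both in [0, 1]), and that descriptor fusion adds each cue scaled by a sigmoid gate. The function names `reweight_keypoint_scores` and `gated_fusion`, the product form, and the sigmoid gates are all hypothetical choices made for illustration.

```python
import math


def reweight_keypoint_scores(scores, semantic_reliability, depth_stability):
    """Scale each raw keypoint score by two weights in [0, 1].

    Assumption: a plain product. The paper may combine the cues differently.
    """
    if not (len(scores) == len(semantic_reliability) == len(depth_stability)):
        raise ValueError("all inputs must have the same length")
    return [
        s * r * d
        for s, r, d in zip(scores, semantic_reliability, depth_stability)
    ]


def sigmoid(x):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(base, cues, gate_logits):
    """Add each cue vector to the base descriptor, scaled by its gate.

    Assumption: one scalar gate per cue, computed as sigmoid(logit).
    The fused descriptor is L2-normalized before it is returned.
    """
    fused = list(base)
    for cue, logit in zip(cues, gate_logits):
        gate = sigmoid(logit)
        fused = [f + gate * c for f, c in zip(fused, cue)]
    norm = math.sqrt(sum(f * f for f in fused))
    return [f / norm for f in fused] if norm > 0 else fused


# Two candidate keypoints: the second lies in a semantically unreliable
# region (weight 0.0), so its response is suppressed entirely.
weighted = reweight_keypoint_scores([0.9, 0.8], [1.0, 0.0], [0.5, 1.0])

# Fuse a 2-D base descriptor with one cue vector using a neutral gate (0.5).
descriptor = gated_fusion([1.0, 0.0], [[0.0, 1.0]], [0.0])
```

In this sketch a keypoint in an unreliable region is driven toward zero no matter how strong its raw response is, which mirrors the suppression behavior the summary describes. A gate logit far below zero would likewise leave the base descriptor almost unchanged.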