Decouple and Rectify: Semantics-Preserving Structural Enhancement for Open-Vocabulary Remote Sensing Segmentation
arXiv cs.CV / 4/3/2026
Key Points
- The paper targets open-vocabulary remote sensing segmentation, observing that CLIP’s global, language-aligned visual features often struggle to delineate fine structural boundaries.
- It proposes DR-Seg, a decouple-and-rectify framework that splits CLIP feature channels into semantics-dominated and structure-dominated subspaces, so that DINO-based structural enhancement can be applied without harming language-aligned semantics.
- A prior-driven graph rectification module injects high-fidelity structural priors under DINO guidance to produce a refined branch for better spatial delineation.
- An uncertainty-guided adaptive fusion module dynamically combines the refined, DINO-rectified branch with the original CLIP branch to produce the final predictions.
- Experiments on eight remote sensing benchmarks show that DR-Seg achieves state-of-the-art performance, reflecting improved boundary quality while preserving open-vocabulary semantic grounding.
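
To make the uncertainty-guided fusion idea above more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not say how uncertainty is measured or how the weights are formed, so this sketch assumes per-pixel softmax entropy as the uncertainty measure and a softmax over negative entropies as the branch weights. The names `fuse_pixel` and `fuse_map` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fuse_pixel(clip_logits, refined_logits):
    """Fuse the class scores of two branches at a single pixel.

    Assumption (not from the paper): the branch with the lower
    predictive entropy is trusted more at this pixel.
    Returns (fused class probabilities, weight given to the refined branch).
    """
    p_clip = softmax(clip_logits)
    p_ref = softmax(refined_logits)
    # Lower entropy -> larger weight, via a softmax over negative entropies.
    w_clip, w_ref = softmax([-entropy(p_clip), -entropy(p_ref)])
    fused = [w_clip * a + w_ref * b for a, b in zip(p_clip, p_ref)]
    return fused, w_ref


def fuse_map(clip_map, refined_map):
    """Apply fuse_pixel over an H x W grid of per-pixel class logits."""
    return [
        [fuse_pixel(c, r)[0] for c, r in zip(clip_row, refined_row)]
        for clip_row, refined_row in zip(clip_map, refined_map)
    ]


# Usage: the CLIP branch is undecided, the refined branch is confident.
fused, w_refined = fuse_pixel([0.0, 0.0, 0.0], [5.0, 0.0, 0.0])
```

In this toy setup, a pixel where the original CLIP branch is undecided and the refined branch is confident leans toward the refined branch, while pixels where both branches are equally certain are averaged evenly. A learned fusion module could replace the fixed entropy rule; the sketch only illustrates per-pixel, uncertainty-dependent weighting.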