Learning to Synergize Semantic and Geometric Priors for Limited-Data Wheat Disease Segmentation

arXiv cs.CV / 4/8/2026

Key Points

  • The paper targets wheat disease segmentation under limited training data by addressing large intra-class temporal appearance variations across growth stages.
  • It proposes SGPer, which combines semantic priors from a pretrained DINOv2 with SAM's geometric localization to produce masks with accurate disease boundaries.
  • SGPer inserts disease-sensitive adapters into both DINOv2 and SAM to align their pretrained representations with disease-specific characteristics, and converts DINOv2 features into dense, category-specific point prompts.
  • It reduces redundant prompts by dynamically filtering candidates using SAM’s iterative mask confidence together with DINOv2-derived semantic consistency.
  • Experiments report state-of-the-art segmentation results on wheat disease and organ benchmarks, with the strongest gains in data-constrained settings and improved invariance to temporal appearance changes.
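
The idea of turning DINOv2 features into dense, category-specific point prompts can be illustrated with a minimal sketch. It is not the paper's implementation: the per-class prototype vectors, the cosine-similarity rule, the `threshold` value, and the function names `cosine` and `dense_point_prompts` are all assumptions made for illustration. The patch size of 14 matches the released DINOv2 ViT models, but the feature grid here is a plain nested list rather than real model output.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def dense_point_prompts(features, prototypes, patch_size=14, threshold=0.5):
    """Turn a grid of patch embeddings into class-labelled point prompts.

    features:   H x W nested list of patch embedding vectors (hypothetical layout).
    prototypes: dict mapping class name -> prototype vector (hypothetical; the
                paper does not say how category evidence is represented).
    Returns a list of (x, y, class_name, similarity), where (x, y) is the
    pixel coordinate of the patch centre.
    """
    prompts = []
    for row, line in enumerate(features):
        for col, vec in enumerate(line):
            # Pick the class whose prototype is most similar to this patch.
            best_cls, best_sim = None, float("-inf")
            for cls, proto in prototypes.items():
                sim = cosine(vec, proto)
                if sim > best_sim:
                    best_cls, best_sim = cls, sim
            # Keep only patches that match some class confidently enough.
            if best_cls is not None and best_sim >= threshold:
                x = col * patch_size + patch_size // 2
                y = row * patch_size + patch_size // 2
                prompts.append((x, y, best_cls, best_sim))
    return prompts
```

Each returned point could then be passed to a promptable segmenter such as SAM as a positive point for its class. Because every confident patch emits a point, the set is deliberately dense and redundant, which is why a filtering stage follows.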

Abstract

Wheat disease segmentation is fundamental to precision agriculture but faces severe challenges from significant intra-class temporal variations across growth stages. Such substantial appearance shifts make collecting a representative dataset for training from scratch both labor-intensive and impractical. To address this, we propose SGPer, a Semantic-Geometric Prior Synergization framework that treats wheat disease segmentation under limited data as a coupled task of disease-specific semantic perception and disease boundary localization. Our core insight is that pretrained DINOv2 provides robust category-aware semantic priors to handle appearance shifts, which can be converted into coarse spatial prompts to guide SAM for the precise localization of disease boundaries. Specifically, SGPer designs disease-sensitive adapters with multiple disease-friendly filters and inserts them into both DINOv2 and SAM to align their pretrained representations with disease-specific characteristics. To operationalize this synergy, SGPer transforms DINOv2-derived features into dense, category-specific point prompts to ensure comprehensive spatial coverage of all disease regions. To subsequently eliminate prompt redundancy and ensure highly accurate mask generation, it dynamically filters these dense candidates by cross-referencing SAM's iterative mask confidence with the category-specific semantic consistency derived from DINOv2. Ultimately, SGPer distills a highly informative set of prompts to activate SAM's geometric priors, achieving precise and robust segmentation that remains strictly invariant to temporal appearance changes. Extensive evaluations demonstrate that SGPer consistently achieves state-of-the-art performance on wheat disease and organ segmentation benchmarks, especially in data-constrained scenarios.
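
The prompt-filtering step described in the abstract can be sketched as follows. This is only one plausible reading under stated assumptions, not the authors' procedure: the abstract does not give the scoring rule, so the product of the two scores, the `min_score` and `max_overlap` thresholds, the greedy overlap suppression, and the candidate dictionary layout are all hypothetical. Masks are represented as sets of pixel coordinates to keep the example dependency-free.

```python
def mask_iou(a, b):
    """Intersection-over-union of two masks given as sets of (x, y) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def filter_prompts(candidates, min_score=0.5, max_overlap=0.5):
    """Keep a small, informative subset of point-prompt candidates.

    candidates: list of dicts with hypothetical keys
        'point'           -> (x, y) prompt location
        'mask'            -> set of (x, y) pixels predicted for that prompt
        'mask_conf'       -> segmenter confidence for the mask, in [0, 1]
        'sem_consistency' -> agreement of the mask with the prompt's class, in [0, 1]
    A candidate survives if its combined score is high enough and its mask
    does not largely duplicate a mask that was already kept.
    """
    def score(c):
        # Assumed combination rule: both signals must be high for a high score.
        return c["mask_conf"] * c["sem_consistency"]

    kept = []
    for cand in sorted(candidates, key=score, reverse=True):
        if score(cand) < min_score:
            break  # candidates are sorted, so all remaining ones score lower
        if all(mask_iou(cand["mask"], k["mask"]) <= max_overlap for k in kept):
            kept.append(cand)
    return kept
```

Multiplying the two scores means a geometrically confident mask is still rejected when it disagrees with the semantic evidence, and vice versa, which mirrors the cross-referencing of SAM confidence and DINOv2 consistency that the abstract describes.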