Unlocking Optical Prior: Spectrum-Guided Knowledge Transfer for SAR Generalized Category Discovery

arXiv cs.CV / 4/27/2026


Key Points

The paper tackles the difficulty of applying Generalized Category Discovery (GCD) to label-scarce SAR data by addressing cross-modal incompatibility between optical foundation models’ priors and SAR imagery.
It introduces the Modal Discrepancy Curve (MDC), modeling cross-modal mismatch as a structured frequency-domain descriptor based on spectral energy distributions.
Building on the MDC, the authors propose MCPT (MDC-guided Cross-modal Prior Transfer), a pre-training framework that operates on paired optical–SAR data. Adaptive Frequency Tokenization (AFT) converts the MDC into learnable tokens, and Frequency-aware Expert Refinement (FER) uses those tokens to refine features band by band in a discrepancy-aware way.
The approach uses contrastive learning to align refined embeddings across optical and SAR modalities, then transfers the learned SAR representations to downstream single-modal SAR-GCD tasks.
Experiments on multiple mainstream datasets report state-of-the-art results, suggesting that modeling the discrepancy in the frequency domain transfers the optical prior to SAR imagery more effectively.
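To make the idea of a frequency-domain discrepancy descriptor concrete, here is a minimal NumPy sketch. It is not the paper's MDC definition, which this summary does not spell out. It assumes the descriptor can be approximated by comparing the fraction of spectral energy that a paired optical and SAR image place in concentric frequency bands. The function names `radial_energy_profile` and `discrepancy_curve`, the band count, and the log-ratio form are all illustrative assumptions.

```python
import numpy as np

def radial_energy_profile(img, n_bands=16):
    """Fraction of spectral energy in each of n_bands concentric frequency rings.

    Hypothetical helper for illustration; not taken from the paper.
    """
    img = np.asarray(img, dtype=np.float64)
    img = img - img.mean()                    # drop the DC term (overall brightness)
    power = np.abs(np.fft.fft2(img)) ** 2     # 2-D power spectrum
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]           # vertical frequency, cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]           # horizontal frequency, cycles/pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # Equal-width rings from 0 to 0.5 cycles/pixel; anything beyond 0.5
    # (the spectrum corners) is folded into the last ring.
    edges = np.linspace(0.0, 0.5, n_bands + 1)
    band = np.clip(np.digitize(radius, edges) - 1, 0, n_bands - 1)
    energy = np.bincount(band.ravel(), weights=power.ravel(), minlength=n_bands)
    return energy / (energy.sum() + 1e-12)

def discrepancy_curve(optical, sar, n_bands=16, eps=1e-8):
    """Per-band log-ratio of SAR to optical energy fractions (assumed form)."""
    p_opt = radial_energy_profile(optical, n_bands)
    p_sar = radial_energy_profile(sar, n_bands)
    return np.log(p_sar + eps) - np.log(p_opt + eps)

# Toy pair: a smooth "optical" pattern and the same scene with
# multiplicative gamma speckle standing in for SAR noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
optical = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * xx / 64) * np.cos(2 * np.pi * 3 * yy / 64)
sar = optical * rng.gamma(shape=4.0, scale=0.25, size=optical.shape)
curve = discrepancy_curve(optical, sar)
print(np.round(curve, 2))
```

In this toy setup the speckle spreads energy into the high-frequency rings, so the curve is positive in the upper bands. A band-indexed vector of this kind is the sort of input that a tokenization step could consume, although how AFT does this is not described here.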

Abstract

Generalized Category Discovery (GCD) holds significant promise for the label-scarce Synthetic Aperture Radar (SAR) domain, yet its efficacy is severely constrained by the cross-modal incompatibility between the inherent optical prior of the Large Vision Models (LVMs) and SAR imagery. Existing domain adaptation methods often lack an inductive bias that reflects imaging characteristics, consequently failing to effectively transfer optical prior into the SAR domain. To address this issue, the Modal Discrepancy Curve (MDC) is introduced to model cross-modal discrepancy as a structured frequency-domain descriptor derived from spectral energy distributions. Leveraging this formulation, we propose the MDC-guided Cross-modal Prior Transfer (MCPT) framework, a pre-training paradigm that operates on paired optical-SAR data. Within this framework, Adaptive Frequency Tokenization (AFT) converts the MDC into learnable tokens, and Frequency-aware Expert Refinement (FER) performs band-wise discrepancy-aware feature refinement using these tokens. Based on the refined representations, contrastive learning aligns refined embeddings across modalities and internalizes the adaptation pattern. Ultimately, the superior SAR feature representation capability learned during paired pre-training is applied to downstream single-modal SAR-GCD tasks. Extensive experiments demonstrate state-of-the-art performance across multiple mainstream datasets, indicating that frequency-domain discrepancy modeling enables more effective adaptation of optical prior to SAR imagery.
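The abstract says that contrastive learning aligns the refined embeddings across modalities, but does not name the loss. As a rough illustration only, the sketch below uses a symmetric InfoNCE objective, a common choice for paired cross-modal alignment. The function name `symmetric_info_nce`, the temperature value, and the use of plain NumPy arrays in place of encoder outputs are assumptions, not details from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(z_opt, z_sar, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings (assumed loss form).

    Row i of z_opt and row i of z_sar are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    z_opt = l2_normalize(np.asarray(z_opt, dtype=np.float64))
    z_sar = l2_normalize(np.asarray(z_sar, dtype=np.float64))
    logits = z_opt @ z_sar.T / temperature          # (batch, batch) similarities
    idx = np.arange(logits.shape[0])
    loss_o2s = -log_softmax(logits, axis=1)[idx, idx].mean()  # optical -> SAR
    loss_s2o = -log_softmax(logits, axis=0)[idx, idx].mean()  # SAR -> optical
    return 0.5 * (loss_o2s + loss_s2o)

# Toy check: well-aligned pairs should give a lower loss than shuffled pairs.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 32))
aligned = symmetric_info_nce(z, z + 0.01 * rng.normal(size=z.shape))
shuffled = symmetric_info_nce(z, np.roll(z, 1, axis=0))
print(aligned < shuffled)
```

Minimizing a loss of this kind pulls each optical embedding toward its paired SAR embedding and away from the other samples in the batch, which is one way the alignment described in the abstract could be realized.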


