Rethinking Representations for Cross-Domain Infrared Small Target Detection: A Generalizable Perspective from the Frequency Domain

arXiv cs.CV / 4/3/2026

Key Points

  • The paper argues that infrared small target detection (IRSTD) depends heavily on how discriminative learned representations are, yet many approaches fail to generalize under inevitable cross-domain distribution shifts.
  • It proposes a spatial-spectral collaborative perception network (S$^2$CPNet), rethinking IRSTD representations from a frequency-domain perspective and identifying spectral phase inconsistencies as the primary manifestation of domain discrepancy.
  • To improve cross-domain robustness, the method introduces a phase rectification module (PRM) for more generalizable target awareness.
  • It further enhances representation quality by using an orthogonal attention mechanism (OAM) in skip connections to preserve positional information, and selective style recomposition (SSR) to reduce overfitting to domain-specific patterns.
  • Experiments on three IRSTD datasets show that the approach achieves state-of-the-art performance across diverse cross-domain settings.
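
To make the frequency-domain framing in the key points concrete, the sketch below splits an image into its Fourier amplitude and phase, rebuilds it, and measures how far the phase spectra of two images differ. It is a minimal illustration in NumPy, not the paper's phase rectification module: the synthetic 32x32 "domains", the 2x2 target, and the helper names `split_spectrum`, `merge_spectrum`, and `mean_phase_gap` are all assumptions made for this example.

```python
import numpy as np


def split_spectrum(image):
    """Return (amplitude, phase) of the 2-D Fourier transform of an image."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def merge_spectrum(amplitude, phase):
    """Rebuild a real-valued image from an amplitude and a phase spectrum."""
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))


def mean_phase_gap(phase_a, phase_b):
    """Mean absolute phase difference, with each difference wrapped to [0, pi]."""
    wrapped = np.angle(np.exp(1j * (phase_a - phase_b)))
    return float(np.mean(np.abs(wrapped)))


rng = np.random.default_rng(0)

# Hypothetical 2x2 "small target" on an empty 32x32 frame.
target = np.zeros((32, 32))
target[15:17, 15:17] = 1.0

# Two hypothetical domains: same target, different synthetic backgrounds.
domain_a = target + 0.1 * rng.standard_normal((32, 32))
domain_b = target + 0.5 * rng.standard_normal((32, 32)) + 0.3

amp_a, pha_a = split_spectrum(domain_a)
amp_b, pha_b = split_spectrum(domain_b)

# Amplitude and phase together carry the full image.
restored = merge_spectrum(amp_a, pha_a)

# A simple scalar summary of how much the two phase spectra disagree.
gap = mean_phase_gap(pha_a, pha_b)

# Recombining amplitude from one domain with phase from the other is a
# common way to probe which component carries which information.
swapped = merge_spectrum(amp_b, pha_a)
```

Here `gap` is only a crude diagnostic; how the paper actually quantifies or corrects phase inconsistency is not specified in this summary.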

Abstract

The accurate target-background separation in infrared small target detection (IRSTD) highly depends on the discriminability of extracted representations. However, most existing methods are confined to domain-consistent settings, while overlooking whether such discriminability can generalize to unseen domains. In practice, distribution shifts between training and testing data are inevitable due to variations in observational conditions and environmental factors. Meanwhile, the intrinsic indistinctiveness of infrared small targets aggravates overfitting to domain-specific patterns. Consequently, the detection performance of models trained on source domains can be severely degraded when deployed in unseen domains. To address this challenge, we propose a spatial-spectral collaborative perception network (S$^2$CPNet) for cross-domain IRSTD. Moving beyond conventional spatial learning pipelines, we rethink IRSTD representations from a frequency perspective and reveal inconsistencies in spectral phase as the primary manifestation of domain discrepancies. Based on this insight, we develop a phase rectification module (PRM) to derive generalizable target awareness. Then, we employ an orthogonal attention mechanism (OAM) in skip connections to preserve positional information while refining informative representations. Moreover, the bias toward domain-specific patterns is further mitigated through selective style recomposition (SSR). Extensive experiments have been conducted on three IRSTD datasets, and the proposed method consistently achieves state-of-the-art performance under diverse cross-domain settings.
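
The abstract does not describe how selective style recomposition works internally. As a rough illustration of the general idea of reducing reliance on domain-specific "style", the sketch below re-normalizes one feature map with channel-wise statistics blended from another, in the spirit of AdaIN-style statistic swapping. This is a swapped-in generic technique, not the authors' SSR: the function names `channel_stats` and `recompose_style`, the `weight` parameter, and the random feature maps are all assumptions for this example.

```python
import numpy as np


def channel_stats(features, eps=1e-6):
    """Per-channel mean and standard deviation of a (C, H, W) feature map."""
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(features.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std


def recompose_style(content, style, weight=0.5):
    """Re-normalize `content` with statistics blended between both inputs.

    weight=1.0 keeps the content statistics unchanged; weight=0.0 adopts
    the statistics of `style` entirely (hypothetical parameterization).
    """
    c_mean, c_std = channel_stats(content)
    s_mean, s_std = channel_stats(style)
    mixed_mean = weight * c_mean + (1.0 - weight) * s_mean
    mixed_std = weight * c_std + (1.0 - weight) * s_std
    normalized = (content - c_mean) / c_std
    return normalized * mixed_std + mixed_mean


rng = np.random.default_rng(0)

# Hypothetical feature maps from two domains: 4 channels, 16x16 each.
feat_source = 2.0 * rng.standard_normal((4, 16, 16)) + 1.0
feat_other = 0.5 * rng.standard_normal((4, 16, 16)) - 3.0

unchanged = recompose_style(feat_source, feat_other, weight=1.0)
restyled = recompose_style(feat_source, feat_other, weight=0.0)
blended = recompose_style(feat_source, feat_other, weight=0.5)
```

The spatial layout of `feat_source` is preserved in every output; only its per-channel mean and scale change, which is why this family of operations is often used as a training-time augmentation against style overfitting.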
