ZoomSpec: A Physics-Guided Coarse-to-Fine Framework for Wideband Spectrum Sensing

arXiv cs.CV / 4/16/2026


Key Points

  • The paper introduces ZoomSpec, a physics-guided coarse-to-fine framework for wideband spectrum sensing in low-altitude monitoring, addressing issues like heterogeneous protocols, very large bandwidths, and non-stationary SNR.
  • It proposes a Log-Space STFT (LS-STFT) to reduce domain mismatch seen in image-like spectrogram methods by sharpening narrowband structures while preserving constant relative resolution.
  • ZoomSpec uses a lightweight Coarse Proposal Net (CPN) to rapidly screen the entire band, then an Adaptive Heterodyne Low-Pass (AHLP) module to align center frequencies, apply bandwidth-matched filtering, and safely decimate to suppress out-of-band interference.
  • For fine-grained results, a Fine Recognition Net (FRN) combines purified time-domain I/Q and spectral magnitude using dual-domain attention to jointly refine temporal boundaries and perform modulation classification.
  • Experiments on the SpaceNet real-world dataset report state-of-the-art performance of 78.1 mAP@0.5:0.95 with improved stability across diverse modulation bandwidths.
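
The summary does not describe AHLP's internals. As a rough sketch of the three steps named in the key points (center-frequency alignment, bandwidth-matched filtering, decimation), the plain-Python function below shifts a proposed band to baseband, applies a Hamming-windowed sinc low-pass, and downsamples. The function name, the tap count, the cutoff at half the bandwidth, and the reading of "safe" as keeping the output rate at least twice the signal bandwidth are all assumptions made for illustration, not details from the paper.

```python
import cmath
import math


def heterodyne_lowpass_decimate(iq, fs, f_center, bandwidth, num_taps=65):
    """Hypothetical helper: shift f_center to DC, low-pass, then decimate.

    iq is a sequence of complex samples at rate fs (Hz).
    Returns (decimated_samples, new_sample_rate).
    """
    # 1) Heterodyne: multiply by exp(-j*2*pi*f_center*n/fs) so the
    #    proposed band is centered at 0 Hz.
    mixed = [x * cmath.exp(-2j * math.pi * f_center * n / fs)
             for n, x in enumerate(iq)]

    # 2) Bandwidth-matched low-pass: Hamming-windowed sinc with cutoff
    #    at bandwidth/2 (assumed), normalized to unity gain at DC.
    cutoff = (bandwidth / 2) / fs  # cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for k in range(num_taps):
        t = k - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    taps = [h / dc_gain for h in taps]

    # 3) Decimation: keep the output rate at least 2x the bandwidth
    #    (assumed meaning of "safe"). The filter is evaluated only at
    #    the samples that are retained.
    factor = max(1, int(fs // (2 * bandwidth)))
    out = []
    for n in range(0, len(mixed), factor):
        acc = 0j
        for k, h in enumerate(taps):
            if 0 <= n - k < len(mixed):
                acc += h * mixed[n - k]
        out.append(acc)
    return out, fs / factor
```

With fs = 1000, f_center = 200, and bandwidth = 50, the decimation factor is 10 and the output rate is 100 Hz. A tone at 200 Hz comes out as a near-constant baseband value once the filter has filled, while a tone at 400 Hz is strongly attenuated.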

Abstract

Wideband spectrum sensing for low-altitude monitoring is critical yet challenging due to heterogeneous protocols, large bandwidths, and non-stationary SNR. Existing data-driven approaches treat spectrograms as natural images, suffering from domain mismatch: they neglect time-frequency resolution constraints and spectral leakage, leading to poor narrowband visibility. This paper proposes ZoomSpec, a physics-guided coarse-to-fine framework integrating signal processing priors with deep learning. We introduce a Log-Space STFT (LS-STFT) to overcome the geometric bottleneck of linear spectrograms, sharpening narrowband structures while maintaining constant relative resolution. A lightweight Coarse Proposal Net (CPN) rapidly screens the full band. To bridge coarse detection and fine recognition, we design an Adaptive Heterodyne Low-Pass (AHLP) module that executes center-frequency aligning, bandwidth-matched filtering, and safe decimation, purifying signals of out-of-band interference. A Fine Recognition Net (FRN) fuses purified time-domain I/Q with spectral magnitude via dual-domain attention to jointly refine temporal boundaries and modulation classification. Evaluations on the SpaceNet real-world dataset demonstrate state-of-the-art 78.1 mAP@0.5:0.95, surpassing existing leaderboard systems with superior stability across diverse modulation bandwidths.
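
The reported metric, mAP@0.5:0.95, follows the COCO-style convention of averaging average precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the matching criterion behind that average, not the full mAP computation. It assumes detections are axis-aligned boxes in the time-frequency plane, written as (t_start, t_end, f_low, f_high); that box layout and the function names are illustrative assumptions, since the summary does not state the annotation format.

```python
def box_iou(a, b):
    """IoU of two boxes given as (t_start, t_end, f_low, f_high)."""
    overlap_t = min(a[1], b[1]) - max(a[0], b[0])
    overlap_f = min(a[3], b[3]) - max(a[2], b[2])
    inter = max(0.0, overlap_t) * max(0.0, overlap_f)
    area_a = (a[1] - a[0]) * (a[3] - a[2])
    area_b = (b[1] - b[0]) * (b[3] - b[2])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Return the IoU thresholds at which pred counts as matching truth."""
    iou = box_iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if iou >= t]
```

A prediction that covers the correct time span but clips 12% of the signal's bandwidth has an IoU of 0.88, so it matches at the eight thresholds up to 0.85 and fails at 0.90 and 0.95. This is why the averaged metric rewards tight boundary localization and not just detection.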