AI Navigate

UAV-CB: A Complex-Background RGB-T Dataset and Local Frequency Bridge Network for UAV Detection

arXiv cs.CV / 3/19/2026

📰 NewsIdeas & Deep AnalysisModels & Research

Key Points

  • The paper introduces UAV-CB, a new RGB-T UAV detection dataset curated to emphasize complex low-altitude backgrounds and camouflage.
  • It also presents the Local Frequency Bridge Network (LFBNet), which models features in localized frequency space to bridge both the frequency-spatial fusion gap and the cross-modality discrepancy gap in RGB-T fusion.
  • Extensive experiments on UAV-CB and public benchmarks show that LFBNet achieves state-of-the-art detection performance and robustness under camouflage and cluttered conditions.
  • This work highlights a frequency-aware multimodal perception approach that could improve real-world UAV detection for perception and defense applications.

Abstract

Detecting Unmanned Aerial Vehicles (UAVs) in low-altitude environments is essential for perception and defense systems but remains highly challenging due to complex backgrounds, camouflage, and multimodal interference. In real-world scenarios, UAVs are frequently visually blended with surrounding structures such as buildings, vegetation, and power lines, resulting in low contrast, weak boundaries, and strong confusion with cluttered background textures. Existing UAV detection datasets, though diverse, are not specifically designed to capture these camouflage and complex-background challenges, which limits progress toward robust real-world perception. To fill this gap, we construct UAV-CB, a new RGB-T UAV detection dataset deliberately curated to emphasize complex low-altitude backgrounds and camouflage characteristics. Furthermore, we propose the Local Frequency Bridge Network (LFBNet), which models features in localized frequency space to bridge both the frequency-spatial fusion gap and the cross-modality discrepancy gap in RGB-T fusion. Extensive experiments on UAV-CB and public benchmarks demonstrate that LFBNet achieves state-of-the-art detection performance and strong robustness under camouflaged and cluttered conditions, offering a frequency-aware perspective on multimodal UAV perception in real-world applications.