DinoRADE: Full Spectral Radar-Camera Fusion with Vision Foundation Model Features for Multi-class Object Detection in Adverse Weather

arXiv cs.CV / 4/10/2026


Key Points

  • DinoRADE is a radar-centered multi-modal perception pipeline designed to improve object detection robustness in adverse weather, using dense FMCW radar tensors fused with camera vision features.
  • The method aggregates vision features around camera-transformed reference points using deformable cross-attention to better recover fine-grained spatial detail needed for detecting small vulnerable road users (VRUs).
  • Vision input comes from a DINOv3 vision foundation model, enabling feature extraction that is then fused with radar features for multi-class detection.
  • The authors evaluate on the K-Radar dataset across all weather conditions, report detection performance individually for five object classes, and achieve a 12.1% improvement over prior radar-camera approaches.
  • Code is released publicly in the RADE-Net repository, supporting reproducibility and further research on radar-camera fusion for safety-critical driving perception.
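The aggregation step described above — projecting radar-derived reference points into the camera view and sampling vision features at learned offsets — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a pinhole camera model, a single attention head and feature level, and hypothetical names (`project_points`, `bilinear_sample`, `deformable_aggregate`).

```python
import numpy as np

def project_points(points_3d, K):
    # Project 3D points (N, 3) in camera coordinates to pixel coords (N, 2)
    # using pinhole intrinsics K (3x3). Assumes positive depth.
    uvw = points_3d @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def bilinear_sample(feat, uv):
    # Bilinearly sample an (H, W, C) feature map at continuous pixel
    # locations uv (N, 2), clamping samples to the image border.
    H, W, _ = feat.shape
    u = np.clip(uv[:, 0], 0, W - 1)
    v = np.clip(uv[:, 1], 0, H - 1)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    u1, v1 = np.minimum(u0 + 1, W - 1), np.minimum(v0 + 1, H - 1)
    du, dv = (u - u0)[:, None], (v - v0)[:, None]
    return (feat[v0, u0] * (1 - du) * (1 - dv) + feat[v0, u1] * du * (1 - dv)
            + feat[v1, u0] * (1 - du) * dv + feat[v1, u1] * du * dv)

def deformable_aggregate(feat, ref_uv, offsets, weights):
    # Deformable-attention-style aggregation (single head, single level):
    # sample the feature map at each reference point plus P learned pixel
    # offsets (N, P, 2), then combine samples with softmax attention
    # weights (N, P) predicted by the query.
    N, P, _ = offsets.shape
    sample_uv = (ref_uv[:, None, :] + offsets).reshape(-1, 2)
    samples = bilinear_sample(feat, sample_uv).reshape(N, P, -1)
    w = np.exp(weights - weights.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return (samples * w[:, :, None]).sum(axis=1)   # (N, C)
```

In the full pipeline the offsets and weights would be predicted per radar query by learned linear layers, and the sampled map would be a DINOv3 feature grid rather than a toy array; the sketch only shows the projection-and-sampling geometry.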

Abstract

Reliable and weather-robust perception systems are essential for safe autonomous driving and typically employ multi-modal sensor configurations to achieve comprehensive environmental awareness. While recent automotive FMCW Radar-based approaches achieved remarkable performance on detection tasks in adverse weather conditions, they exhibited limitations in resolving fine-grained spatial details that are particularly critical for detecting smaller and vulnerable road users (VRUs). Furthermore, existing research has not adequately addressed VRU detection on adverse weather datasets such as K-Radar. We present DinoRADE, a Radar-centered detection pipeline that processes dense Radar tensors and aggregates vision features around transformed reference points in the camera perspective via deformable cross-attention. Vision features are provided by a DINOv3 Vision Foundation Model. We present a comprehensive performance evaluation on the K-Radar dataset in all weather conditions and are among the first to report detection performance individually for five object classes. Additionally, we compare our method with existing single-class detection approaches and outperform recent Radar-camera approaches by 12.1%. The code is available at https://github.com/chr-is-tof/RADE-Net.