Towards All-Day Perception for Off-Road Driving: A Large-Scale Multispectral Dataset and Comprehensive Benchmark

arXiv cs.CV / 5/1/2026


Key Points

  • The paper introduces IRON, a large-scale infrared (IR) dataset for off-road temporal freespace detection under all-day conditions, including nighttime, where visible-light perception is unreliable.
  • IRON includes 24,314 densely annotated IR images with synchronized RGB data across diverse scenes and lighting, addressing the scarcity of annotated IR off-road resources.
  • The authors propose IRONet, a flow-free temporal model that mitigates inter-frame inconsistencies by aggregating historical context through a memory-attention mechanism and a dedicated mask decoder.
  • On the IRON dataset, IRONet sets new state-of-the-art results with real-time inference, achieving 82.93% IoU and 90.66% F1 score.
  • IRONet also shows strong cross-modality generalization by performing robustly on RGB-based benchmarks (ORFD and Rellis), supporting broader applicability beyond IR-only perception.
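
The IoU and F1 figures quoted above are standard overlap metrics for binary segmentation. A minimal sketch of how they can be computed for a single freespace mask follows. The function name `freespace_metrics` and the nested-list mask format are assumptions made for illustration; the source does not describe the authors' evaluation code or how scores are aggregated across images.

```python
def freespace_metrics(pred, target):
    """Return (iou, f1) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the IRON codebase.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # freespace predicted and labeled
            elif p and not t:
                fp += 1          # freespace predicted, not labeled
            elif t and not p:
                fn += 1          # freespace labeled, not predicted
    union = tp + fp + fn
    iou = tp / union if union else 1.0        # two empty masks count as a match
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou, f1 = freespace_metrics(pred, target)
print(f"IoU={iou:.4f}  F1={f1:.4f}")
```

When both metrics come from the same pixel counts, they are linked by F1 = 2·IoU / (1 + IoU). Under that assumption, an IoU of 82.93% corresponds to an F1 of about 90.67%, close to the reported 90.66%.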

Abstract

Off-road nighttime autonomous driving suffers from unreliable visible-light perception, making the infrared modality crucial for accurate freespace detection. However, progress remains limited due to the scarcity of annotated infrared off-road datasets and the inter-frame inconsistencies inherent to current single-frame methods. To address these gaps, we present the IRON dataset, which, to our knowledge, is the first large-scale infrared dataset for off-road temporal freespace detection under all-day conditions, with strong support for nighttime perception. The dataset comprises 24,314 densely annotated infrared images with synchronized RGB images across diverse scenes and lighting conditions. Building upon this dataset, we propose IRONet, a novel flow-free framework for temporal freespace detection that addresses inter-frame inconsistencies by aggregating historical context via a memory-attention mechanism and a carefully designed mask decoder. On our IRON dataset, IRONet achieves state-of-the-art performance, reaching 82.93% (+1.19%) IoU and 90.66% (+0.71%) F1 score at real-time inference. Remarkably, IRONet also generalizes robustly to the RGB modality on the ORFD and Rellis datasets. Overall, our work establishes a foundation for reliable all-day off-road autonomous driving and future research in infrared temporal perception. The code and IRON dataset are available at https://github.com/wsnbws/IRON.
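
The abstract describes the memory-attention mechanism only at a high level, so the following is a generic sketch of the underlying idea rather than IRONet itself: features from the current frame attend over features stored from recent frames, with no optical flow involved. Everything here is an assumption for illustration, including the class name `FrameMemory`, the fixed-size FIFO memory, the residual fusion, and the absence of learned projections and of the mask decoder.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Scaled dot-product attention of one query vector over memory vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in memory]
    weights = softmax(scores)
    return [sum(w * vec[i] for w, vec in zip(weights, memory))
            for i in range(len(query))]


class FrameMemory:
    """Hypothetical FIFO bank of per-frame features (not the authors' design)."""

    def __init__(self, max_frames=3):
        self.bank = deque(maxlen=max_frames)  # oldest frame is dropped first

    def fuse(self, features):
        """Fuse current-frame features with context from stored frames.

        features: list of feature vectors, one per spatial location.
        """
        past = [vec for frame in self.bank for vec in frame]
        if past:
            # Residual fusion: current feature plus attended historical context.
            fused = [[f + c for f, c in zip(vec, attend(vec, past))]
                     for vec in features]
        else:
            fused = [list(vec) for vec in features]  # first frame: no history
        self.bank.append(features)
        return fused


memory = FrameMemory(max_frames=2)
first = memory.fuse([[1.0, 0.0]])    # no history yet, features pass through
second = memory.fuse([[0.0, 1.0]])   # attends over the single stored vector
print(first, second)
```

In a real model the queries, keys, and values would come from learned projections of encoder feature maps, and the fused features would feed a decoder that predicts the freespace mask. The sketch only shows how historical context can be aggregated without estimating flow between frames.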