An Open-Source LiDAR and Monocular Off-Road Autonomous Navigation Stack

arXiv cs.RO / 4/6/2026


Key Points

  • The paper introduces an open-source off-road autonomous navigation stack that supports both LiDAR-based and monocular 3D perception pipelines for obstacle detection in unstructured terrain.
  • For the monocular approach, it uses zero-shot depth prediction from Depth Anything V2 and performs metric depth rescaling with sparse SLAM measurements via VINS-Mono, avoiding task-specific training.
  • It improves robustness by applying edge-masking to reduce obstacle “hallucinations” from depth estimation and adding temporal smoothing to counter SLAM instability.
  • The produced point cloud is converted into a robot-centric 2.5D elevation map used for costmap-based planning.
  • Evaluations in Isaac Sim and real-world environments show the monocular setup can match high-resolution LiDAR performance in most scenarios, and the authors open-source the stack and simulation environment for reproducible benchmarking.
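The monocular pipeline's three core steps — fitting a metric scale from sparse SLAM depths, masking depth edges, and temporally smoothing the scale — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function interfaces, the least-squares scale fit, the gradient threshold, and the EMA smoother are all assumptions about how such a pipeline is typically wired.

```python
import numpy as np

def rescale_depth(rel_depth, slam_uv, slam_depth):
    """Fit a per-frame scale s minimizing ||s * d_rel - d_slam||^2 at
    sparse SLAM landmark pixels, then apply it to the whole depth map.
    (Hypothetical interface; the paper's exact fitting scheme may differ.)"""
    d_rel = rel_depth[slam_uv[:, 1], slam_uv[:, 0]]      # predicted depth at landmarks
    s = np.dot(d_rel, slam_depth) / np.dot(d_rel, d_rel)  # closed-form 1-D least squares
    return s * rel_depth, s

def edge_mask(depth, grad_thresh=0.5):
    """Keep pixels whose depth gradient is small; large gradients mark
    object boundaries where monocular depth tends to hallucinate obstacles."""
    gy, gx = np.gradient(depth)
    return np.hypot(gx, gy) < grad_thresh

class ScaleSmoother:
    """Exponential moving average over the per-frame scale, damping jitter
    from unstable SLAM tracking (illustrative choice of smoother)."""
    def __init__(self, alpha=0.2):
        self.alpha, self.s = alpha, None

    def update(self, s):
        self.s = s if self.s is None else self.alpha * s + (1 - self.alpha) * self.s
        return self.s
```

In this sketch, masked pixels are simply dropped before back-projecting depth into the point cloud, so boundary hallucinations never reach the elevation map.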

Abstract

Off-road autonomous navigation demands reliable 3D perception for robust obstacle detection in challenging unstructured terrain. While LiDAR is accurate, it is costly and power-intensive. Monocular depth estimation using foundation models offers a lightweight alternative, but its integration into outdoor navigation stacks remains underexplored. We present an open-source off-road navigation stack supporting both LiDAR and monocular 3D perception without task-specific training. For the monocular setup, we combine zero-shot depth prediction (Depth Anything V2) with metric depth rescaling using sparse SLAM measurements (VINS-Mono). Two key enhancements improve robustness: edge-masking to reduce obstacle hallucination and temporal smoothing to mitigate the impact of SLAM instability. The resulting point cloud is used to generate a robot-centric 2.5D elevation map for costmap-based planning. Evaluated in photorealistic simulations (Isaac Sim) and real-world unstructured environments, the monocular configuration matches high-resolution LiDAR performance in most scenarios, demonstrating that foundation-model-based monocular depth estimation is a viable LiDAR alternative for robust off-road navigation. By open-sourcing the navigation stack and the simulation environment, we provide a complete pipeline for off-road navigation as well as a reproducible benchmark. Code available at https://github.com/LARIAD/Offroad-Nav.
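The point-cloud-to-2.5D-elevation-map step described above can be sketched as a simple max-height rasterization in the robot frame. This is a minimal illustration under assumed parameters (cell resolution, grid size, NaN for unobserved cells); the actual stack likely relies on a dedicated elevation-mapping component.

```python
import numpy as np

def elevation_map(points, origin, res=0.1, size=20):
    """Rasterize an (N, 3) point cloud in the robot frame into a 2.5D grid
    storing the maximum height per cell; NaN marks unobserved cells.
    `origin` is the (x, y) of grid cell (0, 0) in the robot frame."""
    grid = np.full((size, size), np.nan)
    ij = np.floor((points[:, :2] - origin) / res).astype(int)  # cell indices
    valid = np.all((ij >= 0) & (ij < size), axis=1)            # clip to grid bounds
    for (i, j), z in zip(ij[valid], points[valid, 2]):
        if np.isnan(grid[i, j]) or z > grid[i, j]:
            grid[i, j] = z                                     # keep highest return
    return grid
```

A costmap layer can then be derived from this grid, e.g. by thresholding per-cell height or height differences between neighboring cells.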