LuMon: A Comprehensive Benchmark and Development Suite with Novel Datasets for Lunar Monocular Depth Estimation

arXiv cs.CV / 4/13/2026


Key Points

  • The paper introduces LuMon, a lunar-focused benchmarking framework for monocular depth estimation (MDE) aimed at autonomous lunar rover navigation under Moon-specific visual challenges like harsh shadows and textureless regolith.
  • LuMon provides new evaluation datasets with stereo-derived, high-quality depth ground truth from the real Chang’e-3 mission as well as the CHERI dark analog dataset, addressing prior lack of realistic conditions and metric ground truth.
  • A systematic zero-shot evaluation of state-of-the-art MDE architectures is reported across synthetic, analog, and real datasets, with testing tailored to mission-critical scenarios such as craters, rocks, extreme shading, and differing depth ranges.
  • The authors propose a sim-to-real domain adaptation baseline by fine-tuning a foundation model on synthetic data; results show large in-domain gains but limited generalization to authentic lunar imagery, indicating a persistent cross-domain transfer gap.
  • The study concludes with an analysis of current network limitations and positions LuMon as a standard foundation to guide future extraterrestrial perception and domain adaptation research.
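The paper itself does not ship code, but the zero-shot evaluation described above rests on the standard MDE error metrics (absolute relative error, RMSE, and the δ<1.25 accuracy threshold). A minimal sketch of those metrics, written with NumPy and assuming metric depth maps with a validity mask (the function name and interface here are illustrative, not from the paper):

```python
import numpy as np

def mde_metrics(pred, gt, valid_mask=None):
    """Standard monocular depth metrics over valid (finite, positive) pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mask = np.isfinite(gt) & (gt > 0)
    if valid_mask is not None:
        mask &= valid_mask
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)      # mean absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))     # root mean squared error
    ratio = np.maximum(p / g, g / p)
    delta1 = np.mean(ratio < 1.25)            # fraction of pixels within 25% of GT
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```

On a benchmark like LuMon these per-image scores would typically be averaged over each dataset split (synthetic, analog, real) to expose the cross-domain gaps the authors report.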

Abstract

Monocular Depth Estimation (MDE) is crucial for autonomous lunar rover navigation using electro-optical cameras. However, deploying terrestrial MDE networks on the Moon introduces a severe domain gap due to harsh shadows, textureless regolith, and zero atmospheric scattering. Existing evaluations rely on analogs that fail to replicate these conditions and lack true metric ground truth. To address this, we present LuMon, a comprehensive benchmarking framework for evaluating MDE methods for lunar exploration. We introduce novel datasets featuring high-quality stereo ground-truth depth from the real Chang'e-3 mission and the CHERI dark analog dataset. Using this framework, we conduct a systematic zero-shot evaluation of state-of-the-art architectures across synthetic, analog, and real datasets. We rigorously assess performance against mission-critical challenges such as craters, rocks, extreme shading, and varying depth ranges. Furthermore, we establish a sim-to-real domain adaptation baseline by fine-tuning a foundation model on synthetic data. While this adaptation yields drastic in-domain performance gains, it exhibits minimal generalization to authentic lunar imagery, highlighting a persistent cross-domain transfer gap. Our extensive analysis reveals the inherent limitations of current networks and sets a standard foundation to guide future advancements in extraterrestrial perception and domain adaptation.
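The abstract does not specify the evaluation protocol, but zero-shot comparisons of MDE foundation models that output relative (affine-invariant) depth are commonly made metric by a per-image least-squares scale-and-shift alignment against the ground truth before scoring. A minimal sketch of that alignment step, assuming NumPy arrays and a boolean validity mask (the function name is illustrative, not from the paper):

```python
import numpy as np

def align_scale_shift(pred, gt, mask):
    """Fit s, t minimizing ||s * pred + t - gt||^2 over valid pixels,
    then return the aligned prediction s * pred + t."""
    p = pred[mask].ravel()
    g = gt[mask].ravel()
    A = np.stack([p, np.ones_like(p)], axis=1)   # design matrix [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s * pred + t
```

After alignment, the usual metrics (AbsRel, RMSE, δ<1.25) are computed on the aligned map; models that natively predict metric depth can instead be scored without this step.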