AI Navigate

Semantic Segmentation and Depth Estimation for Real-Time Lunar Surface Mapping Using 3D Gaussian Splatting

arXiv cs.CV / 3/20/2026


Key Points

  • The paper presents a framework for real-time lunar surface mapping that combines dense perception models with a 3D Gaussian Splatting (3DGS) map representation.
  • It benchmarks candidate models on synthetic data from the LuPNT simulator, selecting a GRU-based stereo dense depth estimation model for its balance of speed and accuracy, and a convolutional neural network (CNN) for semantic segmentation.
  • By using ground-truth poses to decouple local scene understanding from global state estimation, it reconstructs a 120-meter traverse with approximately 3 cm height accuracy, outperforming a traditional LiDAR-free point cloud baseline.
  • The resulting 3DGS map supports novel view synthesis and serves as a foundation for a full SLAM system with potential joint map and pose optimization.
  • The findings indicate that fusing semantic segmentation with dense depth and learned map representations is effective for creating detailed, large-scale lunar maps for future missions.
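The decoupling in the third bullet amounts to lifting each dense stereo depth map into the world frame with a known pose before any fusion, so local perception never depends on an estimated trajectory. A minimal sketch of that back-projection step, assuming a standard pinhole model (the names `K` and `T_world_cam` are illustrative, not from the paper):

```python
import numpy as np

def backproject_depth(depth, K, T_world_cam):
    """Lift a dense metric depth map into a world-frame point cloud.

    depth:       (H, W) depth in metres (e.g. from stereo matching)
    K:           (3, 3) pinhole camera intrinsics
    T_world_cam: (4, 4) ground-truth camera pose (world <- camera)
    Returns an (H*W, 3) array of world-frame points, row-major in v then u.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                                  # camera-frame viewing rays
    pts_cam = rays * depth.reshape(-1, 1)                            # scale rays by depth
    pts_h = np.concatenate([pts_cam, np.ones((H * W, 1))], axis=1)   # homogeneous 3D points
    return (pts_h @ T_world_cam.T)[:, :3]                            # transform into world frame
```

With ground-truth poses supplied, each frame's points land directly in a consistent global frame, which is what makes the ~3 cm height comparison against the terrain meaningful.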

Abstract

Navigation and mapping on the lunar surface require robust perception under challenging conditions, including poorly textured environments, high-contrast lighting, and limited computational resources. This paper presents a real-time mapping framework that integrates dense perception models with a 3D Gaussian Splatting (3DGS) representation. We first benchmark several models on synthetic datasets generated with the LuPNT simulator, selecting a stereo dense depth estimation model based on Gated Recurrent Units for its balance of speed and accuracy in depth estimation, and a convolutional neural network for its superior performance in detecting semantic segments. Using ground truth poses to decouple the local scene understanding from the global state estimation, our pipeline reconstructs a 120-meter traverse with a geometric height accuracy of approximately 3 cm, outperforming a traditional point cloud baseline without LiDAR. The resulting 3DGS map enables novel view synthesis and serves as a foundation for a full SLAM system, where its capacity for joint map and pose optimization would offer significant advantages. Our results demonstrate that combining semantic segmentation and dense depth estimation with learned map representations is an effective approach for creating detailed, large-scale maps to support future lunar surface missions.
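As a rough illustration of how such a 3DGS map could be seeded from the fused geometry, a common recipe in open 3DGS implementations initializes one isotropic Gaussian per point, with scales stored in log space and opacities as logits so later optimization is unconstrained. Attaching a per-point semantic label from the segmentation network is an assumption about the paper's design; the field names below are hypothetical:

```python
import numpy as np

def seed_gaussians(points, colors, labels, init_scale=0.05):
    """Seed one isotropic 3D Gaussian per world-frame surface point.

    points: (N, 3) fused points, colors: (N, 3) RGB in [0, 1],
    labels: (N,) semantic class ids carried over from segmentation.
    """
    pts = np.asarray(points, dtype=float)
    n = pts.shape[0]
    return {
        "means": pts,                                       # Gaussian centres
        "log_scales": np.full((n, 3), np.log(init_scale)),  # isotropic 5 cm start (assumed)
        "opacity_logits": np.zeros(n),                      # sigmoid(0) = 0.5 initial opacity
        "colors": np.asarray(colors, dtype=float),          # per-Gaussian base colour
        "labels": np.asarray(labels),                       # semantic tag per Gaussian
    }
```

Carrying the label alongside each Gaussian is one plausible way a map like this could serve both novel view synthesis and semantically-aware downstream tasks in a future SLAM system.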