BALTIC: A Benchmark and Cross-Domain Strategy for 3D Reconstruction Across Air and Underwater Domains Under Varying Illumination

arXiv cs.CV · April 22, 2026


Key Points

  • The paper introduces BALTIC, a controlled benchmark for evaluating 3D reconstruction methods under systematic changes in medium (air vs. underwater) and lighting conditions (ambient, artificial, mixed).
  • BALTIC includes 13 datasets with added diversity in motion types, scanning patterns, and initialization trajectories, and it supports accurate ground-truth pose estimation via a custom water tank setup with a monocular camera and HTC Vive tracker.
  • The study measures cross-domain reconstruction performance by augmenting underwater sequences with a small set of in-air views captured under similar lighting conditions, then evaluating Structure-from-Motion (COLMAP) for both trajectory accuracy and scene geometry.
  • The reconstructed outputs are used to train and assess Neural Radiance Fields and 3D Gaussian Splatting models, with evaluation against ground-truth trajectories, in-air references, and both perceptual/photometric rendering metrics.
  • Results indicate that, under controlled texture-consistent conditions, 3D Gaussian Splatting with simple preprocessing (such as white-balance correction) can match specialized underwater methods, but robustness drops in more complex and heterogeneous real-world environments.
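The "simple preprocessing" highlighted in the last point is exemplified by white-balance correction, which compensates for the wavelength-dependent attenuation (especially of red light) in underwater imagery. The paper does not specify which white-balance algorithm is used; below is a minimal sketch of one common choice, the gray-world assumption, where each channel is rescaled so that its mean matches the global mean intensity. The function name and the synthetic test frame are illustrative, not from the paper.

```python
import numpy as np

def gray_world_white_balance(img: np.ndarray) -> np.ndarray:
    """Gray-world white balance for a float RGB image in [0, 1]:
    scale each channel so its mean matches the global mean intensity."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / channel_means
    return np.clip(img * gains, 0.0, 1.0)

# Synthetic greenish "underwater" frame: red is absorbed fastest in water,
# so we simulate a frame whose red channel is strongly attenuated.
rng = np.random.default_rng(0)
frame = rng.random((4, 4, 3))
frame[..., 0] *= 0.3  # attenuate red
balanced = gray_world_white_balance(frame)
```

After correction, the per-channel means are (up to clipping) equal, which removes the global color cast before the images are fed to the reconstruction pipeline.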

Abstract

Robust 3D reconstruction across varying environmental conditions remains a critical challenge for robotic perception, particularly when transitioning between air and water. To address this, we introduce BALTIC, a controlled benchmark designed to systematically evaluate modern 3D reconstruction methods under variations in medium and lighting. The benchmark comprises 13 datasets spanning two media (air and water) and three lighting conditions (ambient, artificial, and mixed), with additional variations in motion type, scanning pattern, and initialization trajectory, resulting in a diverse set of sequences. Our experimental setup features a custom water tank equipped with a monocular camera and an HTC Vive tracker, enabling accurate ground-truth pose estimation. We further investigate cross-domain reconstruction by augmenting underwater image sequences with a small number of in-air views captured under similar lighting conditions. We evaluate Structure-from-Motion reconstruction using COLMAP in terms of both trajectory accuracy and scene geometry, and use these reconstructions as input to Neural Radiance Fields and 3D Gaussian Splatting methods. The resulting models are assessed against ground-truth trajectories and in-air references, while rendered outputs are compared using perceptual and photometric metrics. Additionally, we perform a color restoration analysis to evaluate radiometric consistency across domains. Our results show that under controlled, texture-consistent conditions, Gaussian Splatting with simple preprocessing (e.g., white-balance correction) can achieve performance comparable to specialized underwater methods, although its robustness decreases in more complex and heterogeneous real-world environments.
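The abstract's "photometric metrics" for comparing rendered outputs against reference views typically include PSNR (peak signal-to-noise ratio), which the paper does not define explicitly; a minimal sketch of the standard formulation follows. The function name and the test values are illustrative.

```python
import numpy as np

def psnr(reference: np.ndarray, rendered: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between a reference image and a render.

    PSNR = 10 * log10(max_val^2 / MSE); higher is better, and a perfect
    render (zero MSE) yields infinity.
    """
    mse = np.mean((reference.astype(np.float64) - rendered.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_val**2 / mse)

# Example: a uniform 0.1 intensity offset gives MSE = 0.01, i.e. 20 dB.
ref = np.full((8, 8, 3), 0.5)
render = ref + 0.1
score = psnr(ref, render)
```

In practice such photometric scores are paired with a perceptual metric (e.g., LPIPS) because pixel-wise error alone correlates poorly with the visual quality of novel-view renders.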