Leveling3D: Leveling Up 3D Reconstruction with Feed-Forward 3D Gaussian Splatting and Geometry-Aware Generation

arXiv cs.CV / 3/18/2026

Key Points

  • Leveling3D proposes a unified pipeline that integrates feed-forward 3D reconstruction with geometry-aware generation to improve novel-view synthesis and depth estimation.
  • It introduces a geometry-aware leveling adapter that aligns internal diffusion model knowledge with the geometry prior from the 3D reconstruction, enabling plausible generation in underconstrained artifact regions.
  • The method employs a palette filtering strategy during training and a test-time masking refinement at inference, diversifying outputs while preventing messy boundaries along the repaired regions.
  • The enhanced extrapolated views can be fed back as inputs to feed-forward 3D Gaussian Splatting, and the approach achieves state-of-the-art results on public benchmarks for novel-view synthesis and depth estimation.
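The summary does not specify how the test-time masking refinement is implemented. As a hedged, hypothetical sketch of the general idea of masked compositing (blending diffusion-generated content into the corrupted regions of a rendered view while keeping the boundary seamless), one might feather the binary artifact mask before alpha-blending; the function `feathered_blend` and its parameters below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def feathered_blend(rendered, generated, artifact_mask, feather=3):
    """Composite a generated image into the artifact region of a render.

    rendered, generated: (H, W, 3) float arrays in [0, 1].
    artifact_mask: (H, W) array, 1 where the render is corrupted.
    feather: number of blur passes softening the mask edge; 0 gives a
    hard cut, larger values give a wider, smoother transition band.
    (Illustrative sketch only -- not the paper's implementation.)
    """
    alpha = artifact_mask.astype(np.float64)
    # Cheap 4-neighbor averaging blur, repeated `feather` times,
    # turns the hard mask edge into a gradual 0-to-1 ramp.
    for _ in range(feather):
        alpha = 0.25 * (np.roll(alpha, 1, 0) + np.roll(alpha, -1, 0)
                        + np.roll(alpha, 1, 1) + np.roll(alpha, -1, 1))
    alpha = np.clip(alpha, 0.0, 1.0)[..., None]
    # Inside the mask the generated content dominates; outside, the
    # original render is preserved; the feathered band mixes the two.
    return alpha * generated + (1.0 - alpha) * rendered
```

With `feather=0` this reduces to a hard paste, which is exactly the "messy boundary" failure mode the refinement is meant to avoid; the feathered version trades a few boundary pixels of each source for a seam-free composite.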

Abstract

Feed-forward 3D reconstruction has revolutionized 3D vision, providing a powerful baseline for downstream tasks such as novel-view synthesis with 3D Gaussian Splatting. Previous works explore fixing corrupted rendering results with a diffusion model; however, they lack geometric awareness and fail to fill the missing areas in extrapolated views. In this work, we introduce Leveling3D, a novel pipeline that integrates feed-forward 3D reconstruction with geometry-consistent generation to enable holistic, simultaneous reconstruction and generation. We propose a geometry-aware leveling adapter, a lightweight technique that aligns the internal knowledge of the diffusion model with the geometry prior from the feed-forward model. The leveling adapter enables generation in the artifact areas of extrapolated novel views caused by underconstrained regions of the 3D representation. To learn a more diversely distributed generation, we introduce a palette filtering strategy for training and a test-time masking refinement that prevents messy boundaries along the repaired regions. More importantly, the enhanced extrapolated novel views from Leveling3D can be used as inputs to feed-forward 3DGS, leveling up the 3D reconstruction. We achieve state-of-the-art performance on public datasets for tasks including novel-view synthesis and depth estimation.