AirSplat: Alignment and Rating for Robust Feed-Forward 3D Gaussian Splatting

arXiv cs.CV / 3/27/2026


Key Points

  • The paper introduces AirSplat, a training framework that adapts 3D Vision Foundation Models (3DVFMs) to pose-free, high-fidelity novel view synthesis (NVS), a task these models handle poorly when applied directly.
  • AirSplat’s Self-Consistent Pose Alignment (SCPA) adds a training-time feedback loop to align supervision at the pixel level and reduce pose–geometry mismatches.
  • It also proposes Rating-based Opacity Matching (ROM), which uses a sparse-view NVS teacher’s local 3D consistency to filter out degraded 3D Gaussian primitives.
  • Experiments on large-scale benchmarks report significantly improved reconstruction quality over existing state-of-the-art pose-free NVS methods.
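
The ROM idea in the third point (use a teacher's consistency signal to suppress degraded primitives) can be illustrated with a toy sketch. This is not the paper's formulation: the per-primitive `rating` values, the `threshold`, the binary cross-entropy loss, and the `min_opacity` cutoff are all assumptions made here for illustration, since the summary does not specify how ratings are computed or matched.

```python
import math

def opacity_matching_loss(pred_opacity, rating, threshold=0.5, eps=1e-7):
    """Toy loss: push opacity toward 1 for well-rated primitives, toward 0 otherwise.

    pred_opacity: predicted opacities in [0, 1], one per Gaussian primitive.
    rating: hypothetical teacher-derived consistency scores in [0, 1].
    """
    if len(pred_opacity) != len(rating) or not pred_opacity:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, r in zip(pred_opacity, rating):
        p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
        target = 1.0 if r >= threshold else 0.0  # assumed hard target from the rating
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total / len(pred_opacity)

def keep_mask(opacity, min_opacity=0.05):
    """Mark primitives whose opacity stays above an assumed cutoff."""
    return [o >= min_opacity for o in opacity]

# Opacities that agree with the ratings incur a lower loss than ones that disagree.
ratings = [0.8, 0.2]
agree = opacity_matching_loss([0.9, 0.1], ratings)
disagree = opacity_matching_loss([0.1, 0.9], ratings)
```

Under these assumptions, a primitive that the teacher rates poorly is driven toward zero opacity during training and can then be dropped with `keep_mask` at render time.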

Abstract

While 3D Vision Foundation Models (3DVFMs) have demonstrated remarkable zero-shot capabilities in visual geometry estimation, their direct application to generalizable novel view synthesis (NVS) remains challenging. In this paper, we propose AirSplat, a novel training framework that effectively adapts the robust geometric priors of 3DVFMs into high-fidelity, pose-free NVS. Our approach introduces two key technical contributions: (1) Self-Consistent Pose Alignment (SCPA), a training-time feedback loop that ensures pixel-aligned supervision to resolve pose-geometry discrepancy; and (2) Rating-based Opacity Matching (ROM), which leverages the local 3D geometry consistency knowledge from a sparse-view NVS teacher model to filter out degraded primitives. Experimental results on large-scale benchmarks demonstrate that our method significantly outperforms state-of-the-art pose-free NVS approaches in reconstruction quality. Our AirSplat highlights the potential of adapting 3DVFMs to enable simultaneous visual geometry estimation and high-quality view synthesis.
