ViewSplat: View-Adaptive Dynamic Gaussian Splatting for Feed-Forward Synthesis

arXiv cs.CV · March 27, 2026


Key Points

  • ViewSplat is a view-adaptive 3D Gaussian splatting network for novel view synthesis from unposed images that targets the fidelity gap in existing feed-forward (single-step) Gaussian splatting methods.
  • Instead of regressing one fixed set of Gaussian primitives for all viewpoints, it learns a view-adaptable latent representation with dynamic MLPs that produce view-dependent residual updates to Gaussian attributes (position, scale, rotation, opacity, color).
  • The approach shifts from static primitive regression to view-adaptive dynamic splatting, enabling primitives to correct initial estimation errors during rendering.
  • Experiments report state-of-the-art visual fidelity while preserving fast performance, including 17 FPS inference and 154 FPS real-time rendering.
  • The contribution is architectural: a new mechanism for improving reconstruction quality without returning to per-scene optimization, presented as an arXiv announcement.

Abstract

We present ViewSplat, a view-adaptive 3D Gaussian splatting network for novel view synthesis from unposed images. While recent feed-forward 3D Gaussian splatting has significantly accelerated 3D scene reconstruction by bypassing per-scene optimization, a fundamental fidelity gap remains. We attribute this bottleneck to the limited capacity of single-step feed-forward networks to regress static Gaussian primitives that satisfy all viewpoints. To address this limitation, we shift the paradigm from static primitive regression to view-adaptive dynamic splatting. Instead of a rigid Gaussian representation, our pipeline learns a view-adaptable latent representation. Specifically, ViewSplat initially predicts base Gaussian primitives alongside the weights of dynamic MLPs. During rendering, these MLPs take target view coordinates as input and predict view-dependent residual updates for each Gaussian attribute (i.e., 3D position, scale, rotation, opacity, and color). This mechanism, which we term view-adaptive dynamic splatting, allows each primitive to rectify initial estimation errors, effectively capturing high-fidelity appearances. Extensive experiments demonstrate that ViewSplat achieves state-of-the-art fidelity while maintaining fast inference (17 FPS) and real-time rendering (154 FPS).
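To make the mechanism concrete, the following is a minimal numpy sketch of the idea the abstract describes: a feed-forward pass predicts base Gaussian attributes together with the weights of a small dynamic MLP, and at render time that MLP maps the target view direction (plus a per-primitive latent code) to residual updates added to the base attributes. All names, dimensions, and the MLP architecture here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 4                        # number of Gaussian primitives (toy value)
ATTR = 3 + 3 + 4 + 1 + 3     # position, scale, rotation (quat), opacity, color = 14
LATENT = 8                   # assumed per-primitive latent size

# Stand-ins for the outputs of the single feed-forward pass:
base = rng.normal(size=(N, ATTR))        # base Gaussian attributes
latent = rng.normal(size=(N, LATENT))    # view-adaptable latent codes
W1 = rng.normal(scale=0.1, size=(LATENT + 3, 16))  # dynamic-MLP weights,
b1 = np.zeros(16)                                  # also network-predicted
W2 = rng.normal(scale=0.1, size=(16, ATTR))
b2 = np.zeros(ATTR)

def splat_attrs(view_dir):
    """Per-view attributes: base Gaussians plus the MLP's residual update."""
    d = np.asarray(view_dir, dtype=float)
    d = d / np.linalg.norm(d)                                 # unit view direction
    x = np.concatenate([latent, np.tile(d, (N, 1))], axis=1)  # (N, LATENT+3)
    h = np.maximum(x @ W1 + b1, 0.0)                          # ReLU hidden layer
    return base + h @ W2 + b2                                 # residual-corrected

front = splat_attrs([0.0, 0.0, 1.0])
side = splat_attrs([1.0, 0.0, 0.0])
print(front.shape)                        # (4, 14)
print(bool(np.allclose(front, side)))     # False: attributes adapt per viewpoint
```

The key design point the sketch mirrors is that the primitives themselves are no longer static: the renderer re-evaluates a cheap MLP per target view, so each Gaussian can rectify its initial estimate without any per-scene optimization loop.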