SparseCam4D: Spatio-Temporally Consistent 4D Reconstruction from Sparse Cameras

arXiv cs.CV · March 30, 2026


Key Points

  • The paper introduces SparseCam4D, a sparse-camera framework for dynamic (4D) reconstruction aimed at replacing expensive dense synchronized camera lab setups.
  • Its core contribution is a Spatio-Temporal Distortion Field that models and corrects inconsistencies in generative observations across both spatial and temporal dimensions.
  • The authors present an end-to-end pipeline to reconstruct 4D scenes from sparse, uncalibrated camera inputs.
  • Experiments on multi-camera dynamic scene benchmarks show spatio-temporally consistent, high-fidelity renderings that outperform prior approaches.

Abstract

High-quality 4D reconstruction enables photorealistic, immersive rendering of the dynamic real world. However, unlike static scenes, which can be fully captured with a single camera, high-quality dynamic scenes typically require dense arrays of tens or even hundreds of synchronized cameras. The reliance on such costly lab setups severely limits practical scalability. To this end, we propose a sparse-camera dynamic reconstruction framework that exploits abundant yet inconsistent generative observations. Our key innovation is the Spatio-Temporal Distortion Field, which provides a unified mechanism for modeling inconsistencies in generative observations across both spatial and temporal dimensions. Building on this, we develop a complete pipeline that enables 4D reconstruction from sparse, uncalibrated camera inputs. We evaluate our method on multi-camera dynamic scene benchmarks, achieving spatio-temporally consistent, high-fidelity renderings and significantly outperforming existing approaches.
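To make the core idea concrete, here is a minimal toy sketch of what a spatio-temporal distortion field could look like. The paper does not specify its parameterization, so everything below is an assumption for illustration: the field is modeled as a small MLP (with random, untrained weights) that maps a 4D query — a 3D point plus a timestamp — to a spatial correction offset, which is then applied to warp inconsistent generative observations toward a consistent state.

```python
import numpy as np

# Hypothetical sketch only: the actual Spatio-Temporal Distortion Field in
# SparseCam4D is not specified here. We stand in a tiny one-hidden-layer MLP
# with random weights that maps (x, y, z, t) -> a 3-D correction offset.

rng = np.random.default_rng(0)

# Toy MLP parameters: 4 inputs (x, y, z, t) -> 32 hidden units -> 3-D offset.
W1 = rng.normal(scale=0.1, size=(4, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 3))
b2 = np.zeros(3)

def distortion_field(xyz, t):
    """Predict a per-point spatial offset for 3-D query points at time t."""
    # Append the timestamp to each point to form 4-D spatio-temporal queries.
    q = np.concatenate([xyz, np.full((xyz.shape[0], 1), t)], axis=1)
    h = np.tanh(q @ W1 + b1)   # hidden activation
    return h @ W2 + b2         # small 3-D correction offsets

def correct_observation(points, t):
    """Warp inconsistently generated points by the predicted distortion."""
    return points + distortion_field(points, t)

# Example: correct 5 points observed at t = 0.5.
pts = rng.normal(size=(5, 3))
corrected = correct_observation(pts, t=0.5)
print(corrected.shape)  # (5, 3)
```

In a real system such a field would be optimized jointly with the 4D scene representation, so that residual spatial and temporal inconsistencies in the generative observations are absorbed by the field rather than baked into the reconstruction; the names and architecture above are purely illustrative.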