ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion Multi-View Dynamic Scene Reconstruction

arXiv cs.CV / 4/16/2026


Key Points

  • The paper introduces ClipGStream, a hybrid dynamic 3D scene reconstruction framework designed for long, large-motion multi-view sequences that are difficult for prior methods.
  • Unlike Frame-Stream approaches, which optimize per frame (scaling well but with weaker temporal stability), or Clip approaches, which optimize locally (at the cost of higher memory use and limited sequence length), ClipGStream performs stream optimization at the clip level.
  • ClipGStream models motion using clip-independent spatio-temporal fields and residual anchor compensation for local variation, while reusing inherited anchors/decoders to preserve structural consistency across clips.
  • The clip-stream design aims to deliver flicker-free reconstructions with improved temporal coherence while reducing memory overhead, targeting VR/MR/XR-ready dynamic content.
  • Experiments report state-of-the-art reconstruction quality and efficiency compared with existing dynamic Gaussian baselines, with a project page provided.
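The clip-level streaming described above can be summarized as a simple loop: split the long sequence into short clips, optimize each clip locally (spatio-temporal field plus residual anchors), and carry inherited anchors/decoder state forward to the next clip. The sketch below is a highly simplified, hypothetical illustration of that control flow only; all names (`ClipState`, `optimize_clip`, etc.) are stand-ins and do not come from the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ClipState:
    # Stand-ins for the anchors and decoder parameters that are inherited
    # across clip boundaries to keep structural consistency.
    anchors: list = field(default_factory=list)
    decoder: dict = field(default_factory=dict)

def split_into_clips(frames, clip_len):
    """Divide a long frame sequence into short clips of at most clip_len frames."""
    return [frames[i:i + clip_len] for i in range(0, len(frames), clip_len)]

def optimize_clip(clip, inherited):
    """Placeholder for per-clip optimization: fit a clip-independent
    spatio-temporal field plus residual anchor compensation, starting
    from the anchors/decoder inherited from the previous clip."""
    residuals = [f"res_{f}" for f in clip]  # stand-in for residual anchors
    return ClipState(anchors=inherited.anchors + residuals,
                     decoder=dict(inherited.decoder))

def clip_stream(frames, clip_len=4):
    """Stream over the sequence clip by clip, inheriting state between clips
    instead of re-optimizing the whole sequence or each frame in isolation."""
    state = ClipState()
    for clip in split_into_clips(frames, clip_len):
        state = optimize_clip(clip, state)
    return state

# 10 frames with clip_len=4 -> clips of sizes 4, 4, 2
result = clip_stream(list(range(10)), clip_len=4)
```

The point of the structure is that memory scales with the clip length rather than the full sequence, while the inherited state is what provides cross-clip temporal coherence.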

Abstract

Dynamic 3D scene reconstruction is essential for immersive media such as VR, MR, and XR, yet remains challenging for long multi-view sequences with large-scale motion. Existing dynamic Gaussian approaches are either Frame-Stream, offering scalability but poor temporal stability, or Clip, achieving local consistency at the cost of high memory and limited sequence length. We propose ClipGStream, a hybrid reconstruction framework that performs stream optimization at the clip level rather than the frame level. The sequence is divided into short clips, where dynamic motion is modeled using clip-independent spatio-temporal fields and residual anchor compensation to capture local variations efficiently, while inter-clip inherited anchors and decoders maintain structural consistency across clips. This Clip-Stream design enables scalable, flicker-free reconstruction of long dynamic videos with high temporal coherence and reduced memory overhead. Extensive experiments demonstrate that ClipGStream achieves state-of-the-art reconstruction quality and efficiency. The project page is available at: https://liangjie1999.github.io/ClipGStreamWeb/