MoRGS: Efficient Per-Gaussian Motion Reasoning for Streamable Dynamic 3D Scenes

arXiv cs.CV / 3/27/2026


Key Points

  • The paper introduces MoRGS, an online framework for reconstructing dynamic 3D (i.e., 4D) scenes from streaming multi-view inputs under low-latency constraints.
  • It argues that prior online 3D Gaussian Splatting approaches do not learn physically meaningful per-Gaussian motion because they optimize only with photometric loss, causing motion to overfit pixel residuals.
  • MoRGS adds explicit per-Gaussian motion reasoning by using optical flow from a sparse set of key views as lightweight motion cues to regularize motion beyond appearance supervision.
  • To handle sparse flow supervision, it learns a per-Gaussian motion offset field that aligns projected 3D motion with observed optical flow across time and views.
  • The method also introduces a per-Gaussian motion confidence that distinguishes dynamic from static Gaussians, improving temporal consistency and accelerating the modeling of large motions. Experiments show state-of-the-art reconstruction quality and motion fidelity among online methods.
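The flow-alignment idea in the points above can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the pinhole intrinsics and the `project` and `flow_loss` names are invented here, and the real method additionally learns a motion offset field to reconcile sparse flow with 3D motion.

```python
import numpy as np

def project(points, f=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of camera-space 3D points to pixel coordinates.
    The intrinsics (f, cx, cy) are hypothetical placeholders."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.stack([f * x / z + cx, f * y / z + cy], axis=1)

def flow_loss(centers_t, centers_t1, observed_flow):
    """L1 discrepancy between the 2D motion induced by per-Gaussian
    3D displacement (projected into a key view) and the optical flow
    observed in that view. Static Gaussians with zero observed flow
    are pushed toward zero 3D motion."""
    induced_flow = project(centers_t1) - project(centers_t)
    return np.abs(induced_flow - observed_flow).mean()
```

In this toy form, the loss regularizes per-Gaussian motion with a cue that photometric supervision alone does not provide: motion that "chases pixel residuals" without matching observed flow is penalized.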

Abstract

Online reconstruction of dynamic scenes aims to learn from streaming multi-view inputs under low-latency constraints. The fast training and real-time rendering capabilities of 3D Gaussian Splatting have made on-the-fly reconstruction practically feasible, enabling online 4D reconstruction. However, existing online approaches, despite their efficiency and visual quality, fail to learn per-Gaussian motion that reflects true scene dynamics. Without explicit motion cues, appearance and motion are optimized solely under photometric loss, causing per-Gaussian motion to chase pixel residuals rather than true 3D motion. To address this, we propose MoRGS, an efficient online per-Gaussian motion reasoning framework that explicitly models per-Gaussian motion to improve 4D reconstruction quality. Specifically, we leverage optical flow on a sparse set of key views as lightweight motion cues that regularize per-Gaussian motion beyond photometric supervision. To compensate for the sparsity of flow supervision, we learn a per-Gaussian motion offset field that reconciles discrepancies between projected 3D motion and observed flow across views and time. In addition, we introduce a per-Gaussian motion confidence that separates dynamic from static Gaussians and weights Gaussian attribute residual updates, thereby suppressing redundant motion in static regions for better temporal consistency and accelerating the modeling of large motions. Extensive experiments demonstrate that MoRGS achieves state-of-the-art reconstruction quality and motion fidelity among online methods, while maintaining streamable performance.
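The confidence-weighted residual update described in the abstract can be sketched as below. This is an assumed reading of the mechanism, not the paper's implementation: the function name, the learning rate, and the per-attribute layout are all placeholders.

```python
import numpy as np

def confidence_weighted_update(attrs, residuals, confidence, lr=0.1):
    """Scale per-Gaussian attribute residual updates by a motion
    confidence in [0, 1]. Low-confidence (likely static) Gaussians
    receive suppressed updates, which avoids redundant motion in
    static regions; high-confidence (dynamic) Gaussians take full
    steps, speeding up the modeling of large motions.

    attrs:      (N, D) per-Gaussian attributes (e.g., position offsets)
    residuals:  (N, D) proposed updates from optimization
    confidence: (N,)   per-Gaussian motion confidence
    """
    return attrs + lr * confidence[:, None] * residuals
```

A Gaussian with confidence 0 is effectively frozen across frames, while one with confidence 1 is updated at the full rate; intermediate values interpolate between the two regimes.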