MotionScale: Reconstructing Appearance, Geometry, and Motion of Dynamic Scenes with Scalable 4D Gaussian Splatting

arXiv cs.CV / 4/1/2026


Key Points

  • MotionScale is a new 4D Gaussian Splatting-based method for reconstructing dynamic 4D scenes (appearance, 3D geometry, and their evolution over time) from monocular videos with high fidelity.
  • It represents the motion field scalably through cluster-centric basis transformations, designed to adaptively capture diverse and evolving motion patterns.
  • For robustness over long sequences, it introduces a progressive optimization with two decoupled stages: a background stage that extends to newly visible regions, refines camera poses, and explicitly models transient shadows, and a foreground stage that progressively enforces motion consistency through a multi-step refinement.
  • On challenging real-world benchmarks, it is reported to significantly outperform existing state-of-the-art methods in both reconstruction quality and temporal stability.

Abstract

Realistic reconstruction of dynamic 4D scenes from monocular videos is essential for understanding the physical world. Despite recent progress in neural rendering, existing methods often struggle to recover accurate 3D geometry and temporally consistent motion in complex environments. To address these challenges, we propose MotionScale, a 4D Gaussian Splatting framework that scales efficiently to large scenes and extended sequences while maintaining high-fidelity structural and motion coherence. At the core of our approach is a scalable motion field parameterized by cluster-centric basis transformations that adaptively expand to capture diverse and evolving motion patterns. To ensure robust reconstruction over long durations, we introduce a progressive optimization strategy comprising two decoupled propagation stages: 1) A background extension stage that adapts to newly visible regions, refines camera poses, and explicitly models transient shadows; 2) A foreground propagation stage that enforces motion consistency through a specialized three-stage refinement process. Extensive experiments on challenging real-world benchmarks demonstrate that MotionScale significantly outperforms state-of-the-art methods in both reconstruction quality and temporal stability. Project page: https://hrzhou2.github.io/motion-scale-web/.
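The abstract's core idea, a motion field built from cluster-centric basis transformations, can be illustrated with a minimal sketch. This is not the authors' implementation: the soft distance-based blending, the function names, and the toy transforms below are all illustrative assumptions; the idea shown is simply that each Gaussian's motion at a given time is a weighted blend of per-cluster rigid transforms.

```python
import numpy as np

def rotation_z(theta):
    """Rotation about the z-axis (stand-in for a learned cluster rotation)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def deform(points, centers, rotations, translations, tau=0.5):
    """Blend per-cluster rigid transforms with distance-based soft weights.

    points:       (N, 3) Gaussian means in the canonical frame
    centers:      (K, 3) cluster centers
    rotations:    (K, 3, 3) per-cluster rotations at the query time
    translations: (K, 3) per-cluster translations at the query time
    tau:          softness of the cluster assignment (assumed hyperparameter)
    """
    # Soft assignment: nearer clusters receive higher weight.
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)  # (N, K)
    w = np.exp(-d / tau)
    w /= w.sum(axis=1, keepdims=True)

    # Apply each cluster's rigid transform about its own center, then blend.
    local = points[:, None, :] - centers[None, :, :]                 # (N, K, 3)
    moved = np.einsum("kij,nkj->nki", rotations, local)              # rotate
    moved = moved + centers[None, :, :] + translations[None, :, :]   # translate
    return np.einsum("nk,nki->ni", w, moved)                         # (N, 3)

# Toy usage: two clusters, the second translating by +1 along x.
pts = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
ctr = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
rot = np.stack([np.eye(3), rotation_z(0.0)])
trn = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
out = deform(pts, ctr, rot, trn)
```

With this toy setup, the Gaussian sitting on the second cluster's center follows that cluster's translation almost exactly, while the one near the static cluster barely moves. The paper's "scalable" aspect, adaptively expanding the set of basis transformations as new motion patterns appear, is not modeled here.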