DP-DeGauss: Dynamic Probabilistic Gaussian Decomposition for Egocentric 4D Scene Reconstruction

arXiv cs.CV / 4/10/2026


Key Points

  • DP-DeGauss is introduced as a dynamic probabilistic Gaussian decomposition framework aimed at egocentric (first-person) 4D scene reconstruction, addressing challenges like ego-motion, occlusions, and hand–object interactions.
  • The method builds an initial unified 3D Gaussian set from COLMAP priors, adds learnable category probabilities, and routes Gaussians into specialized deformation branches to separately model background, hands, and objects.
  • It uses category-specific masks plus brightness and motion-flow control to improve both static rendering and dynamic reconstruction quality.
  • Experiments report an average gain of +1.70 dB PSNR over baselines, along with improvements in SSIM and LPIPS.
  • The authors claim the first explicit disentanglement of background, hand, and object components at state-of-the-art quality, enabling fine-grained scene understanding and potential editing workflows.
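The routing idea in the second bullet can be illustrated with a minimal sketch. Note this is an assumption about the mechanism, not the paper's code: each Gaussian carries learnable logits over the three categories (background, hand, object), a softmax turns them into probabilities, and the final deformation is a probability-weighted blend of the per-branch outputs. The function names and toy branches below are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the category logits.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def route_deformations(positions, category_logits, branch_fns, t):
    """Soft-route each Gaussian through per-category deformation branches.

    positions:       (N, 3) Gaussian centers
    category_logits: (N, 3) learnable logits for background/hand/object
    branch_fns:      three callables mapping ((N, 3), t) -> (N, 3) offsets
    """
    probs = softmax(category_logits)                                   # (N, 3)
    offsets = np.stack([f(positions, t) for f in branch_fns], axis=1)  # (N, 3, 3)
    # Probability-weighted blend of the three branch deformations.
    blended = (probs[:, :, None] * offsets).sum(axis=1)                # (N, 3)
    return positions + blended, probs

# Toy branches: static background; hand and object drift along different axes.
branches = [
    lambda p, t: np.zeros_like(p),
    lambda p, t: np.tile([t, 0.0, 0.0], (len(p), 1)),
    lambda p, t: np.tile([0.0, t, 0.0], (len(p), 1)),
]

pts = np.zeros((2, 3))
logits = np.array([[5.0, -5.0, -5.0],    # confidently background
                   [-5.0, 5.0, -5.0]])   # confidently hand
moved, probs = route_deformations(pts, logits, branches, t=1.0)
```

Because the routing is probabilistic rather than a hard assignment, gradients flow to the category logits during training, letting each Gaussian settle into the branch that best explains its motion.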

Abstract

Egocentric video is crucial for next-generation 4D scene reconstruction, with applications in AR/VR and embodied AI. However, reconstructing dynamic first-person scenes is challenging due to complex ego-motion, occlusions, and hand-object interactions. Existing decomposition methods are ill-suited, assuming fixed viewpoints or merging all dynamics into a single foreground. To address these limitations, we introduce DP-DeGauss, a dynamic probabilistic Gaussian decomposition framework for egocentric 4D reconstruction. Our method initializes a unified 3D Gaussian set from COLMAP priors, augments each Gaussian with a learnable category probability, and dynamically routes it into specialized deformation branches for background, hand, or object modeling. We employ category-specific masks for better disentanglement and introduce brightness and motion-flow control to improve static rendering and dynamic reconstruction. Extensive experiments show that DP-DeGauss outperforms baselines by +1.70 dB in PSNR on average, with SSIM and LPIPS gains. More importantly, our framework achieves the first, state-of-the-art disentanglement of background, hand, and object components, enabling explicit, fine-grained separation and paving the way for more intuitive egocentric scene understanding and editing.
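The abstract mentions category-specific masks for disentanglement. One plausible way such masks supervise the decomposition is a per-pixel cross-entropy between rendered category-probability maps and ground-truth segmentation labels; the formulation below is a hedged sketch of that idea, not the paper's actual loss, and all names are hypothetical.

```python
import numpy as np

def mask_supervision_loss(rendered_probs, gt_mask, eps=1e-8):
    """Per-pixel cross-entropy between rendered category probabilities
    and a ground-truth category mask (assumed loss form, not from the paper).

    rendered_probs: (H, W, 3) per-pixel background/hand/object probabilities
    gt_mask:        (H, W) integer labels in {0: background, 1: hand, 2: object}
    """
    h, w = gt_mask.shape
    p = rendered_probs.reshape(-1, 3)
    idx = gt_mask.reshape(-1)
    # Pick the probability assigned to the correct category at each pixel.
    picked = p[np.arange(h * w), idx]
    return float(-np.log(picked + eps).mean())

# A perfectly confident, correct rendering yields near-zero loss.
mask = np.array([[0, 1], [2, 0]])
probs = np.eye(3)[mask]          # one-hot (2, 2, 3) matching the mask
loss_perfect = mask_supervision_loss(probs, mask)

# A uniform (uninformative) rendering yields log(3) per pixel.
uniform = np.full((2, 2, 3), 1.0 / 3.0)
loss_uniform = mask_supervision_loss(uniform, mask)
```

Minimizing such a term pushes each Gaussian's category probability toward the branch consistent with the 2D masks, which is what allows the explicit background/hand/object separation the abstract describes.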