DRoPS: Dynamic 3D Reconstruction of Pre-Scanned Objects

arXiv cs.CV / 3/27/2026


Key Points

  • The paper introduces DRoPS, a method for dynamic 3D reconstruction from casual videos that uses a static pre-scan of the object as an explicit geometric and appearance prior.
  • DRoPS addresses limitations of prior work under extreme novel viewpoints and highly articulated motion by constraining the solution space and enforcing geometric consistency across frames.
  • Its key technical contributions include a grid-structured, surface-aligned representation using Gaussian primitives organized into pixel grids anchored to the object surface.
  • Motion is parameterized with a CNN conditioned on these grid-aligned primitives, providing strong implicit regularization and tying motion of nearby points together.
  • Experiments report substantial improvements over the state of the art in both rendering quality and 3D tracking accuracy.
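The coupling between the grid structure and the motion model can be illustrated with a minimal sketch. All names and shapes below are illustrative assumptions, not details from the paper: primitives are stored as an H×W feature grid, and a single averaging convolution stands in for the motion CNN. The point is only the regularization mechanism: because each output is a weighted sum of neighboring grid cells, nearby primitives receive correlated motion.

```python
import numpy as np

# Hypothetical setup: each cell of a pixel grid anchored to the object
# surface holds one Gaussian primitive's feature vector.
H, W, F = 8, 8, 4  # grid height, width, per-primitive feature size
grid = np.random.default_rng(0).normal(size=(H, W, F))

def conv2d(x, kernel):
    """Naive 'same' 2D convolution, applied across all feature channels."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            patch = padded[i:i + kh, j:j + kw]  # (kh, kw, F) neighborhood
            out[i, j] = np.tensordot(kernel, patch, axes=([0, 1], [0, 1]))
    return out

# A 3x3 averaging kernel: every primitive's "motion" mixes its neighbors,
# so adjacent cells move together -- the implicit regularization effect.
kernel = np.ones((3, 3)) / 9.0
motion = conv2d(grid, kernel)  # per-primitive motion features, (8, 8, 4)

# Neighboring cells now differ less than the raw, uncorrelated inputs did.
raw_diff = np.abs(np.diff(grid, axis=0)).mean()
smooth_diff = np.abs(np.diff(motion, axis=0)).mean()
assert smooth_diff < raw_diff
```

A real motion CNN would learn its kernels and output deformation parameters rather than averaged features, but the spatial-coupling argument is the same: convolutional weight sharing ties the motion of nearby surface points together by construction.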

Abstract

Dynamic scene reconstruction from casual videos has seen remarkable recent progress. Numerous approaches have attempted to overcome the ill-posedness of the task by distilling priors from 2D foundation models and by imposing hand-crafted regularization on the optimized motion. However, these methods struggle to reconstruct scenes from extreme novel viewpoints, especially when highly articulated motions are present. In this paper, we present DRoPS, a novel approach that leverages a static pre-scan of the dynamic object as an explicit geometric and appearance prior. While existing state-of-the-art methods fail to fully exploit the pre-scan, DRoPS leverages our novel setup to effectively constrain the solution space and ensure geometric consistency throughout the sequence. The core of our novelty is twofold: first, we establish a grid-structured, surface-aligned model by organizing Gaussian primitives into pixel grids anchored to the object surface. Second, by leveraging the grid structure of our primitives, we parameterize motion using a CNN conditioned on those grids, injecting strong implicit regularization and correlating the motion of nearby points. Extensive experiments demonstrate that our method significantly outperforms the current state of the art in rendering quality and 3D tracking accuracy.