ReFlow: Self-correction Motion Learning for Dynamic Scene Reconstruction

arXiv cs.CV / 4/3/2026


Key Points

  • ReFlow is introduced as a unified framework for monocular dynamic (4D) scene reconstruction that learns 3D motion directly from raw video via a self-correction mechanism rather than relying on external dense motion guidance like pre-computed optical flow.
  • The method improves initialization with a Complete Canonical Space Construction module that covers both static and dynamic regions; incomplete initialization of dynamic regions is a common source of instability in prior approaches.
  • ReFlow decouples static and dynamic components through Separation-Based Dynamic Scene Modeling to provide more targeted motion supervision and reduce failure modes caused by coupled dynamics.
  • Its core self-correction flow matching mechanism combines Full Flow Matching (aligning 3D scene flow with time-varying 2D observations) and Camera Flow Matching (enforcing multi-view consistency for static objects) to improve robustness and accuracy.
  • Experiments across diverse scenarios reportedly show that ReFlow achieves superior reconstruction quality and robustness, positioning it as a new self-correction paradigm for monocular 4D reconstruction.

Abstract

We present ReFlow, a unified framework for monocular dynamic scene reconstruction that learns 3D motion in a novel self-correction manner from raw video. Existing methods often suffer from incomplete scene initialization for dynamic regions, leading to unstable reconstruction and motion estimation; they therefore often resort to external dense motion guidance, such as pre-computed optical flow, to further stabilize and constrain the reconstruction of dynamic components. However, this introduces additional complexity and potential error propagation. To address these issues, ReFlow integrates a Complete Canonical Space Construction module for enhanced initialization of both static and dynamic regions, and a Separation-Based Dynamic Scene Modeling module that decouples static and dynamic components for targeted motion supervision. The core of ReFlow is a novel self-correction flow matching mechanism, consisting of Full Flow Matching to align 3D scene flow with time-varying 2D observations, and Camera Flow Matching to enforce multi-view consistency for static objects. Together, these modules enable robust and accurate dynamic scene reconstruction. Extensive experiments across diverse scenarios demonstrate that ReFlow achieves superior reconstruction quality and robustness, establishing a novel self-correction paradigm for monocular 4D reconstruction.
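
The abstract does not give the exact formulation of the two flow-matching terms, so the following is only a minimal sketch of the underlying geometry, assuming a pinhole camera and world-to-camera poses. It shows how a 3D point's image-plane displacement between two frames can be computed either with its 3D motion included (a "full" flow) or with the motion set to zero (a "camera" flow, which is all a static point should exhibit). The names `project`, `transform`, `induced_flow`, and `flow_residual` are hypothetical helpers for illustration, not part of ReFlow, and the L1 residual is an assumed stand-in for whatever loss the paper actually uses.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]
Pose = Tuple[Sequence[Sequence[float]], Vec3]  # (R, t), world-to-camera (assumed convention)


def project(p: Vec3, fx: float, fy: float, cx: float, cy: float) -> Vec2:
    """Pinhole projection of a camera-space 3D point to pixel coordinates."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def transform(R: Sequence[Sequence[float]], t: Vec3, p: Vec3) -> Vec3:
    """Apply a rigid transform R @ p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def induced_flow(p_world: Vec3, motion: Vec3, pose_t: Pose, pose_t1: Pose,
                 intrinsics: Tuple[float, float, float, float]) -> Vec2:
    """2D displacement of a point between frame t and frame t+1.

    motion is the point's 3D displacement in world space; passing zeros
    yields the flow caused by camera movement alone.
    """
    uv0 = project(transform(pose_t[0], pose_t[1], p_world), *intrinsics)
    p_next = tuple(a + b for a, b in zip(p_world, motion))
    uv1 = project(transform(pose_t1[0], pose_t1[1], p_next), *intrinsics)
    return (uv1[0] - uv0[0], uv1[1] - uv0[1])


def flow_residual(pred: Vec2, target: Vec2) -> float:
    """L1 discrepancy between a predicted and a reference 2D flow (assumed loss)."""
    return abs(pred[0] - target[0]) + abs(pred[1] - target[1])


# Toy setup: the camera shifts 0.5 units along +x between the two frames.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pose_a: Pose = (I3, (0.0, 0.0, 0.0))
pose_b: Pose = (I3, (-0.5, 0.0, 0.0))
K = (100.0, 100.0, 50.0, 50.0)  # fx, fy, cx, cy
point = (0.0, 0.0, 2.0)

# A static point moves in the image only because the camera moved.
camera_flow = induced_flow(point, (0.0, 0.0, 0.0), pose_a, pose_b, K)
# A point that moves with the camera shows no image-plane displacement.
full_flow = induced_flow(point, (0.5, 0.0, 0.0), pose_a, pose_b, K)
```

In this toy setup the static point shifts horizontally in the image while the co-moving point stays put, which illustrates why separating static from dynamic content lets each kind of flow be supervised on its own terms.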