FlowAnchor: Stabilizing the Editing Signal for Inversion-Free Video Editing

arXiv cs.CV / April 27, 2026


Key Points

  • FlowAnchor is a training-free framework that enables stable, efficient inversion-free, flow-based video editing by directly steering the sampling trajectory with an editing signal.
  • The method targets a key failure mode of existing inversion-free video approaches: instability of the editing signal in high-dimensional video latent spaces, caused by imprecise spatial localization and magnitude attenuation over longer sequences.
  • FlowAnchor anchors both the spatial target (“where to edit”) and the strength of the edit (“how strongly to edit”) to stabilize guidance throughout the editing process.
  • It combines Spatial-aware Attention Refinement (to keep textual guidance aligned with relevant spatial regions) and Adaptive Magnitude Modulation (to maintain adequate editing strength despite length effects).
  • Experiments indicate FlowAnchor produces more faithful results, better temporal consistency, and improved computational efficiency, particularly for multi-object scenes and fast-motion videos.
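To make the two anchoring ideas concrete, here is a minimal illustrative sketch of one guidance step. It is not the authors' implementation: the latent-difference editing signal `delta`, the per-location attention mask `attn_mask`, the norm floor `min_norm`, and the step size `t` are all hypothetical stand-ins for the paper's Spatial-aware Attention Refinement (gating "where to edit") and Adaptive Magnitude Modulation (preserving "how strongly to edit").

```python
import numpy as np

def edit_signal_step(z_src, z_tgt_pred, attn_mask, t=1.0, min_norm=1.0):
    """Hypothetical sketch of one stabilized editing-signal update.

    z_src      : current source latent (any shape)
    z_tgt_pred : model's prediction toward the target distribution
    attn_mask  : per-location weights in [0, 1] localizing the edit
    t          : step size along the flow trajectory
    min_norm   : floor on the signal magnitude (counters attenuation)
    """
    # Raw editing signal: direction from the source latent toward the target.
    delta = z_tgt_pred - z_src
    # "Where to edit": gate the signal so it only acts on attended regions,
    # a stand-in for Spatial-aware Attention Refinement.
    delta = delta * attn_mask
    # "How strongly to edit": if the masked signal has attenuated below the
    # floor, rescale it back up, a stand-in for Adaptive Magnitude Modulation.
    norm = np.linalg.norm(delta)
    if 0.0 < norm < min_norm:
        delta = delta * (min_norm / norm)
    # Advance the latent along the (stabilized) editing direction.
    return z_src + t * delta
```

In this toy form, a weak but non-zero signal is renormalized to `min_norm` before the step, while a fully masked-out region (mask of zeros) is left untouched.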

Abstract

We propose FlowAnchor, a training-free framework for stable and efficient inversion-free, flow-based video editing. Inversion-free editing methods have recently shown impressive efficiency and structure preservation in images by directly steering the sampling trajectory with an editing signal. However, extending this paradigm to videos remains challenging, often failing in multi-object scenes or with increased frame counts. We identify the root cause as the instability of the editing signal in high-dimensional video latent spaces, which arises from imprecise spatial localization and length-induced magnitude attenuation. To overcome this challenge, FlowAnchor explicitly anchors both where to edit and how strongly to edit. It introduces Spatial-aware Attention Refinement, which enforces consistent alignment between textual guidance and spatial regions, and Adaptive Magnitude Modulation, which adaptively preserves sufficient editing strength. Together, these mechanisms stabilize the editing signal and guide the flow-based evolution toward the desired target distribution. Extensive experiments demonstrate that FlowAnchor achieves more faithful, temporally coherent, and computationally efficient video editing across challenging multi-object and fast-motion scenarios. The project page is available at https://cuc-mipg.github.io/FlowAnchor.github.io/.