BiFM: Bidirectional Flow Matching for Few-Step Image Editing and Generation

arXiv cs.CV / 3/27/2026


Key Points

  • The paper introduces BiFM (Bidirectional Flow Matching), a unified framework for few-step image editing and generation. It targets the loss in editing quality caused by poor forward-process approximation in few-step sampling.
  • BiFM learns both generation and inversion in a single model by estimating average velocity fields in both the image→noise and noise→image directions, constrained by a shared instantaneous velocity field.
  • It uses continuous time-interval supervision with a bidirectional consistency objective and a lightweight time-interval embedding to stabilize training.
  • The bidirectional formulation supports one-step inversion and can be integrated into common diffusion and flow-matching backbones. In the reported experiments, BiFM outperforms existing few-step methods in both performance and editability.
  • The approach targets scalability and generalization across architectures by avoiding the pretrained generators and auxiliary modules that many prior few-step inversion methods depend on.

Abstract

Recent diffusion and flow matching models have demonstrated strong capabilities in image generation and editing by progressively removing noise through iterative sampling. While this enables flexible inversion for semantic-preserving edits, few-step sampling regimes suffer from poor forward process approximation, leading to degraded editing quality. Existing few-step inversion methods often rely on pretrained generators and auxiliary modules, limiting scalability and generalization across different architectures. To address these limitations, we propose BiFM (Bidirectional Flow Matching), a unified framework that jointly learns generation and inversion within a single model. BiFM directly estimates average velocity fields in both "image → noise" and "noise → image" directions, constrained by a shared instantaneous velocity field derived from either predefined schedules or pretrained multi-step diffusion models. Additionally, BiFM introduces a novel training strategy using continuous time-interval supervision, stabilized by a bidirectional consistency objective and a lightweight time-interval embedding. This bidirectional formulation also enables one-step inversion and can integrate seamlessly into popular diffusion and flow matching backbones. Across diverse image editing and generation tasks, BiFM consistently outperforms existing few-step approaches, achieving superior performance and editability.
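
To make the distinction between average and instantaneous velocity concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a straight-line path between image and noise (one possible "predefined schedule"; the abstract does not say which schedule BiFM uses), and every function name here is hypothetical. The exact displacement along the path stands in for what a trained network would predict.

```python
import numpy as np

def interpolate(image, noise, t):
    """Point on the assumed straight path: t=0 is the image, t=1 is the noise."""
    return (1.0 - t) * image + t * noise

def instantaneous_velocity(image, noise):
    """d x_t / dt on the straight path; constant in t for this toy schedule."""
    return noise - image

def average_velocity(image, noise, r, t):
    """Displacement from time r to time t divided by the interval length (r != t)."""
    return (interpolate(image, noise, t) - interpolate(image, noise, r)) / (t - r)

def jump(x_r, u, r, t):
    """Move from time r to time t in a single step using average velocity u."""
    return x_r + (t - r) * u

rng = np.random.default_rng(0)
image = rng.normal(size=(4, 4))  # stand-in for an image
noise = rng.normal(size=(4, 4))  # stand-in for its paired noise sample

# One-step inversion (image -> noise), then one-step generation (noise -> image).
u_inv = average_velocity(image, noise, 0.0, 1.0)
u_gen = average_velocity(image, noise, 1.0, 0.0)
inverted = jump(image, u_inv, 0.0, 1.0)
regenerated = jump(inverted, u_gen, 1.0, 0.0)
```

On this toy path, the average velocity over any interval equals the instantaneous velocity in both directions, so inversion followed by generation returns the original image exactly. With a learned model and a curved path the two quantities differ, which is presumably where a consistency objective between the two directions becomes useful.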