WFM: 3D Wavelet Flow Matching for Ultrafast Multi-Modal MRI Synthesis

arXiv cs.CV / 4/24/2026

Key Points

The paper argues that diffusion-model MRI synthesis is computationally inefficient because it starts from pure noise, ignoring structural information already contained in available MRI sequences.
It introduces WFM (Wavelet Flow Matching), which learns a flow from an informed prior—computed as the mean of conditioning modalities in wavelet space—enabling high-quality synthesis in only 1–2 integration steps.
Using a single 82M-parameter, class-conditioned model, WFM synthesizes all four BraTS MRI modalities (T1, T1c, T2, FLAIR), replacing four separate diffusion models and cutting the total parameter count from 326M to 82M.
On BraTS 2024, WFM reports 26.8 dB PSNR and 0.94 SSIM, within 1–2 dB of diffusion baselines, while running 250–1000× faster (0.16–0.64 s vs. 160 s per volume).
The authors release their code and present this speed-quality trade-off as making real-time MRI synthesis practical for clinical workflows.
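
To make the informed-prior idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a one-level orthonormal Haar transform as the wavelet, a plain Euler integrator, and a toy straight-line velocity field standing in for the trained network. The names `haar_3d`, `informed_prior`, `euler_integrate`, and `make_toy_velocity` are hypothetical, and the inverse wavelet transform back to image space is omitted.

```python
import numpy as np

def haar_1d(x, axis):
    """One-level orthonormal Haar transform along one axis (even length assumed)."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # approximation coefficients
    high = (even - odd) / np.sqrt(2.0)   # detail coefficients
    return np.moveaxis(np.concatenate([low, high], axis=0), 0, axis)

def haar_3d(vol):
    """Apply the 1D Haar transform along each of the three spatial axes."""
    for axis in range(3):
        vol = haar_1d(vol, axis)
    return vol

def informed_prior(conditioning_volumes):
    """Starting point of the flow: mean of the conditioning modalities in wavelet space."""
    return np.mean([haar_3d(v) for v in conditioning_volumes], axis=0)

def euler_integrate(x0, velocity_fn, target_class, num_steps=2):
    """Integrate dx/dt = v(x, t, class) from t=0 to t=1 with fixed-step Euler."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt, target_class)
    return x

def make_toy_velocity(x1):
    """Toy stand-in for the learned network: a straight-line flow toward a known x1."""
    return lambda x, t, target_class: (x1 - x) / (1.0 - t)

# Example with random volumes in place of real MRI data.
rng = np.random.default_rng(0)
conditioning = [rng.standard_normal((4, 4, 4)) for _ in range(3)]
target_coeffs = haar_3d(rng.standard_normal((4, 4, 4)))

x0 = informed_prior(conditioning)
synth = euler_integrate(x0, make_toy_velocity(target_coeffs), target_class=0, num_steps=2)
```

With a perfectly straight flow, as in this toy field, Euler integration reaches the target in one or two steps. That is the intuition behind the few-step claim: the closer the prior is to the target, the closer the learned flow can be to a straight line.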

Abstract

Diffusion models have achieved remarkable quality in multi-modal MRI synthesis, but their computational cost (hundreds of sampling steps and separate models per modality) limits clinical deployment. We observe that this inefficiency stems from an unnecessary starting point: diffusion begins from pure noise, discarding the structural information already present in available MRI sequences. We propose WFM (Wavelet Flow Matching), which instead learns a direct flow from an informed prior, the mean of conditioning modalities in wavelet space, to the target distribution. Because the source and target share underlying anatomy and differ primarily in contrast, this formulation enables accurate synthesis in just 1-2 integration steps. A single 82M-parameter model with class conditioning synthesizes all four BraTS modalities (T1, T1c, T2, FLAIR), replacing four separate diffusion models totaling 326M parameters. On BraTS 2024, WFM achieves 26.8 dB PSNR and 0.94 SSIM, within 1-2 dB of diffusion baselines, while running 250-1000x faster (0.16-0.64s vs. 160s per volume). This speed-quality trade-off makes real-time MRI synthesis practical for clinical workflows. Code is available at https://github.com/yalcintur/WFM.
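
The headline figures in the abstract can be checked with simple arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative. The helper `mse_ratio_for_psnr_gap` relies on the standard definition PSNR = 10·log10(MAX² / MSE) and assumes both methods are scored against the same peak value.

```python
import math

# Parameter counts quoted in the abstract, in millions.
wfm_params_m = 82
diffusion_params_m = 326
param_ratio = diffusion_params_m / wfm_params_m  # roughly 4x fewer parameters

# Per-volume runtimes quoted in the abstract, in seconds.
diffusion_seconds = 160.0
wfm_seconds = (0.16, 0.64)
speedups = [diffusion_seconds / s for s in wfm_seconds]  # about 1000x and 250x

def mse_ratio_for_psnr_gap(gap_db):
    """A PSNR gap of gap_db dB corresponds to an MSE ratio of 10^(gap_db / 10)."""
    return 10.0 ** (gap_db / 10.0)

print(f"parameter ratio: {param_ratio:.2f}x")
print(f"speedups: {speedups[0]:.0f}x and {speedups[1]:.0f}x")
print(f"MSE ratio at 1 dB: {mse_ratio_for_psnr_gap(1):.2f}, at 2 dB: {mse_ratio_for_psnr_gap(2):.2f}")
```

Read this way, "within 1-2 dB" means WFM's mean squared error is roughly 1.26 to 1.58 times that of the diffusion baselines.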