From Euler to Dormand-Prince: ODE Solvers for Flow Matching Generative Models

arXiv cs.LG / 5/5/2026

Key Points

  • Sampling from Flow Matching generative models means solving an ODE whose computational cost is dominated by neural network forward passes; the paper derives four solvers (Euler, Explicit Midpoint, RK4, and Dormand–Prince 5(4)) from first principles via Taylor expansion.
  • It provides from-scratch PyTorch implementations of all four solvers (see the sketch after this list) and benchmarks them on Conditional Flow Matching tasks ranging from 2D toy distributions to MNIST, measuring sample quality with sliced Wasserstein distance.
  • The benchmarks trace out NFE-quality Pareto frontiers showing that RK4 at roughly 80 function evaluations matches the sample quality of Euler at roughly 200.
  • The authors report two empirical findings: the Jacobian spectrum of the learned velocity field stiffens sharply near t=1 (which is why the adaptive Dormand–Prince solver concentrates its step budget at the end of the trajectory), and the quality gap between low-order and high-order solvers widens for undertrained and smaller models, so solver choice matters most when the model is imperfect.
  • All code and experiment scripts are released publicly, enabling direct reproduction and further experimentation.
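
To make the fixed-step solvers concrete: Euler follows from truncating the Taylor expansion x(t+h) = x(t) + h·v(x(t), t) + O(h²), while RK4 combines four evaluations per step for fourth-order accuracy. Below is a minimal PyTorch sketch of both samplers; the `velocity(x, t)` callable and the step counts are placeholders for illustration, not the paper's exact interface.

```python
import torch

def euler_sample(velocity, x, n_steps=200):
    # Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler.
    # One velocity call per step, so NFE = n_steps.
    h = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0],), i * h, device=x.device)
        x = x + h * velocity(x, t)
    return x

def rk4_sample(velocity, x, n_steps=20):
    # Classical fourth-order Runge-Kutta: four velocity calls per step,
    # so NFE = 4 * n_steps (~80 NFE at n_steps=20, matching the paper's
    # reported comparison point against Euler at 200 NFE).
    h = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0],), i * h, device=x.device)
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x
```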

Abstract

Sampling from Flow Matching generative models requires solving an ordinary differential equation (ODE) whose computational cost is dominated by neural network forward passes. We derive four classical ODE solvers -- Euler, Explicit Midpoint, Classical Runge-Kutta (RK4), and Dormand-Prince 5(4) -- from first principles via Taylor expansion, implement them from scratch in PyTorch, and systematically benchmark their efficiency on Conditional Flow Matching tasks ranging from 2D toy distributions to MNIST digits. On the quantitative side, we use sliced Wasserstein distance to construct NFE-quality Pareto frontiers, finding that RK4 at 80 function evaluations achieves sample quality comparable to Euler at 200. Beyond reproducing known convergence rates, we report two empirical observations: (1) the Jacobian eigenvalue spectrum of the learned velocity field stiffens sharply near t=1, explaining why the adaptive Dormand-Prince solver automatically concentrates its step budget at the end of the trajectory; (2) the quality gap between low-order and high-order solvers widens for undertrained and smaller models, indicating that solver choice matters most when the model is imperfect. Code and all experiment scripts are publicly available.
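
The quality metric throughout is sliced Wasserstein distance. For readers unfamiliar with it, here is a minimal sketch of one common Monte-Carlo estimator: project both sample sets onto random unit directions and compare the sorted 1D projections, which solve the 1D optimal transport problem in closed form. This assumes equal sample counts in both sets and is not necessarily the paper's exact implementation.

```python
import torch

def sliced_wasserstein(x, y, n_projections=128):
    # Monte-Carlo sliced Wasserstein-2 distance between two point clouds
    # x, y of shape (n, d). Draw random unit directions, project, and
    # average the 1D squared-W2 distances over the projections.
    d = x.shape[1]
    theta = torch.randn(n_projections, d, device=x.device)
    theta = theta / theta.norm(dim=1, keepdim=True)
    px = (x @ theta.T).sort(dim=0).values  # (n, n_projections)
    py = (y @ theta.T).sort(dim=0).values
    # Sorted projections give the 1D optimal coupling; mean over samples
    # and projections yields an estimate of SW2 squared.
    return ((px - py) ** 2).mean().sqrt()
```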