Drifting Fields are not Conservative

arXiv cs.LG / 4/9/2026

Key Points

  • The paper studies drifting generative models, which move samples toward the data distribution using a vector-valued drift field, and asks whether this procedure corresponds to optimizing a scalar loss.
  • It finds that, in general, drift fields are not conservative and therefore cannot be expressed as the gradient of any scalar potential (see the numerical sketch after this list).
  • The work identifies the position-dependent normalization of the drift field as the main source of this non-conservatism.
  • It shows that the Gaussian kernel is the special case in which the normalization does not break conservatism, making the drift field exactly the gradient of a scalar function.
  • The authors propose an alternative "sharp kernel" normalization that restores conservatism for any radial kernel, and they conclude that although drift-field matching can be more general than loss minimization, the practical gains are minimal, so loss-based training is preferable.
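
To make the conservatism claim concrete: a smooth vector field is the gradient of a scalar potential only if its Jacobian is symmetric. The sketch below is not code from the paper; the kernels, bandwidth, and toy data are illustrative assumptions. It builds one common form of kernel-normalized drift field, v(x) = Σᵢ wᵢ(x)(yᵢ − x) with wᵢ(x) = k(‖x − yᵢ‖) / Σⱼ k(‖x − yⱼ‖), and measures the Jacobian asymmetry numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))   # toy "data" points y_i (an assumption)
sigma = 1.0                        # illustrative bandwidth

def drift(x, kernel):
    # Normalized drift v(x) = sum_i w_i(x) (y_i - x),
    # with position-dependent weights w_i(x) = k(r_i) / sum_j k(r_j).
    r = np.linalg.norm(data - x, axis=1)
    w = kernel(r)
    w = w / w.sum()                # the position-dependent normalization
    return (w[:, None] * (data - x)).sum(axis=0)

def jacobian(x, kernel, h=1e-5):
    # Central-difference Jacobian J[i, j] = dv_i / dx_j.
    d = x.size
    J = np.zeros((d, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = h
        J[:, j] = (drift(x + e, kernel) - drift(x - e, kernel)) / (2 * h)
    return J

gaussian = lambda r: np.exp(-r**2 / (2 * sigma**2))
laplacian = lambda r: np.exp(-r / sigma)

x = rng.normal(size=3)
for name, kernel in [("Gaussian", gaussian), ("Laplacian", laplacian)]:
    J = jacobian(x, kernel)
    print(f"{name:9s} kernel: max |J - J^T| = {np.abs(J - J.T).max():.2e}")
# A gradient field has a symmetric Jacobian; expect ~0 (finite-difference
# noise) for the Gaussian kernel and a clearly nonzero value for the Laplacian.
```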

Abstract

Drifting models generate high-quality samples in a single forward pass by transporting generated samples toward the data distribution along a vector-valued drift field. We investigate whether this procedure is equivalent to optimizing a scalar loss and find that, in general, it is not: drift fields are not conservative, so they cannot be written as the gradient of any scalar potential. We identify the position-dependent normalization as the source of this non-conservatism. The Gaussian kernel is the unique exception: there the normalization is harmless, and the drift field is exactly the gradient of a scalar function. Generalizing this observation, we propose an alternative normalization via a related kernel (the sharp kernel) that restores conservatism for any radial kernel, yielding well-defined loss functions for training drifting models. Although the drift-field matching objective is strictly more general than loss minimization, since it can implement non-conservative transport fields that no scalar loss can reproduce, we observe that the practical gains from this flexibility are minimal. We therefore propose training drifting models with the conceptually simpler loss-based formulations.
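
For intuition on why the Gaussian kernel is the exception, here is a small numerical check. It is a sketch under the same assumed toy setup as above, not the paper's construction, and the sharp-kernel normalization is not reproduced here. For k(x, y) = exp(−‖x − y‖² / 2σ²), the normalized drift equals the gradient of the scalar potential φ(x) = σ² log Σᵢ k(x, yᵢ), the classic mean-shift identity, which is consistent with the abstract's claim that the Gaussian-normalized field is exactly a gradient.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 3))   # toy "data" points y_i (an assumption)
sigma = 1.0                        # illustrative bandwidth

def kvals(x):
    # Gaussian kernel values k(x, y_i) = exp(-||x - y_i||^2 / (2 sigma^2)).
    return np.exp(-np.sum((data - x) ** 2, axis=1) / (2 * sigma**2))

def drift(x):
    # Kernel-normalized drift v(x) = sum_i w_i(x) (y_i - x).
    w = kvals(x)
    w = w / w.sum()
    return (w[:, None] * (data - x)).sum(axis=0)

def phi(x):
    # Candidate scalar potential phi(x) = sigma^2 * log sum_i k(x, y_i).
    return sigma**2 * np.log(kvals(x).sum())

def grad_phi(x, h=1e-5):
    # Central-difference gradient of phi.
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (phi(x + e) - phi(x - e)) / (2 * h)
    return g

x = rng.normal(size=3)
print("max |v(x) - grad phi(x)| =", np.abs(drift(x) - grad_phi(x)).max())
# Expected ~1e-9: for the Gaussian kernel the normalized drift is exactly
# the gradient of phi, so a scalar loss reproducing it exists.
```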