Learning biophysical models of gene regulation with probability flow matching

arXiv cs.LG / 4/29/2026

Key Points

  • The paper proposes Probability Flow Matching (PFM), a scalable method to learn stochastic, biophysically consistent models of gene regulation from time-resolved single-cell measurements.
  • Across three hematopoiesis datasets, the authors show that models with similar interpolation accuracy can encode fundamentally different dynamics: only biophysically consistent formulations accurately capture the mechanisms of lineage transitions, fate specification, and gene perturbation responses.
  • The study demonstrates that PFM can handle unbalanced cell populations, allowing simultaneous inference of both proliferation and death dynamics.
  • Overall, the results position PFM as a bridge between mechanistic modeling and single-cell omics that can improve interpretability and generalization to perturbations and new conditions.

Abstract

Cellular differentiation is governed by gene regulatory networks, the high-dimensional stochastic biochemical systems that determine the transcriptional landscape and mediate cellular responses to signals and perturbations. Although single-cell RNA sequencing provides quantitative snapshots of the transcriptome, current methods for inferring gene-regulatory dynamics often lack mechanistic interpretability and fail to generalize to unseen conditions. Here we introduce Probability Flow Matching (PFM), a scalable framework for learning biophysically consistent stochastic processes directly from time-resolved single-cell measurements. Applying PFM to three hematopoiesis datasets, we show that models with similar interpolation accuracy can encode fundamentally different dynamics, with only biophysically consistent formulations accurately capturing mechanisms of lineage transitions, fate specification, and gene perturbation responses. We further demonstrate that PFM accommodates unbalanced populations, enabling simultaneous inference of cellular proliferation and death dynamics. Together, these results establish PFM as a flexible, scalable framework for integrating mechanistic modeling with single-cell omics.
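
To make the data setting concrete, the sketch below simulates the kind of input the abstract describes: time-resolved snapshots of a stochastic process in an unbalanced population, where cell numbers change through proliferation and death. It is a toy illustration and not the paper's PFM method. The single-gene double-well drift, the noise level, the birth and death rates, and the function name `simulate_snapshots` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_snapshots(n_cells=500, t_end=2.0, dt=0.01,
                       snapshot_times=(0.0, 1.0, 2.0),
                       sigma=0.3, birth_rate=1.0, death_rate=0.5, seed=0):
    """Toy one-gene SDE with state-dependent birth and death (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 0.1, size=n_cells)  # initial expression levels
    n_steps = int(round(t_end / dt))
    snap_steps = {int(round(t / dt)): t for t in snapshot_times}
    snapshots = {}
    for step in range(n_steps + 1):
        if step in snap_steps:
            snapshots[snap_steps[step]] = x.copy()
        if step == n_steps:
            break
        # Euler-Maruyama step for dx = (x - x^3) dt + sigma dW (assumed bistable drift)
        noise = rng.normal(size=x.shape)
        x = x + (x - x**3) * dt + sigma * np.sqrt(dt) * noise
        # Assumed growth rule: cells with x > 0 divide, cells with x <= 0 die
        u = rng.random(x.shape)
        divide = (x > 0) & (u < birth_rate * dt)
        die = (x <= 0) & (u < death_rate * dt)
        x = np.concatenate([x[~die], x[divide]])  # remove dead cells, duplicate dividers
    return snapshots

snaps = simulate_snapshots()
for t in sorted(snaps):
    print(f"t={t}: {len(snaps[t])} cells")
```

Because the snapshots contain different numbers of cells, a model that assumes a conserved population would have to explain the shift toward the proliferating state through the drift alone. Accounting for growth separately is what lets the two effects be distinguished.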