Learning interacting particle systems from unlabeled data

arXiv stat.ML / 4/6/2026

Key Points

  • The paper tackles learning the interaction potentials of interacting particle systems when particle positions are observed at discrete times but are unlabeled, so trajectory information is missing because of data-collection limitations or privacy constraints.
  • It proposes a trajectory-free self-test loss function derived from the weak-form stochastic evolution equation of the empirical distribution, designed to estimate potentials without needing labeled trajectories.
  • The loss is quadratic in the potentials, which supports both parametric and nonparametric regression algorithms that scale to large, high-dimensional systems with large datasets.
  • Systematic numerical tests indicate that the method outperforms baselines that first recover trajectories via label matching and then regress on them, and that it tolerates large observation time steps.
  • The authors prove that the parametric estimators converge as the sample size increases, which gives the method a theoretical foundation.
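
As a rough illustration of why a weak-form fit needs no trajectories, and why a loss that is quadratic in the unknowns reduces to a linear solve, the sketch below estimates two coefficients from shuffled snapshots. Everything in it is an assumption made for illustration and is not taken from the paper: the noise-free 1D toy system with V(x) = a x^2 / 2 and Phi(r) = b r^2 / 2, the test functions f(x) = x and f(x) = x^2, the forward-difference time derivative, and all function names. It is a generic weak-form least-squares fit, not the paper's self-test loss.

```python
import math
import random

# Toy 1D system (hypothetical, for illustration only):
#   dx_i/dt = -a * x_i - b * (x_i - mean(x)),
# i.e. V(x) = a x^2 / 2 and Phi(r) = b r^2 / 2, with the noise switched off
# so that the output is reproducible.
A_TRUE, B_TRUE = 1.0, 2.0


def snapshot(x0, t):
    """Exact positions at time t, returned in shuffled (unlabeled) order."""
    m0 = sum(x0) / len(x0)
    xs = [m0 * math.exp(-A_TRUE * t)
          + (x - m0) * math.exp(-(A_TRUE + B_TRUE) * t) for x in x0]
    random.shuffle(xs)  # destroy the particle labels
    return xs


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Test functions f paired with their derivatives f'.
TESTS = [
    (lambda x: x, lambda x: 1.0),
    (lambda x: x * x, lambda x: 2.0 * x),
]


def weak_form_rows(xs_now, xs_next, dt):
    """One regression row (c_a, c_b, y) per test function, y ~ c_a*a + c_b*b.

    Every entry is an average over particles, so particle order is irrelevant.
    """
    m = mean(xs_now)
    rows = []
    for f, fp in TESTS:
        # Forward difference of the empirical average of f.
        y = (mean(f(x) for x in xs_next) - mean(f(x) for x in xs_now)) / dt
        c_a = -mean(fp(x) * x for x in xs_now)        # multiplies a
        c_b = -mean(fp(x) * (x - m) for x in xs_now)  # multiplies b
        rows.append((c_a, c_b, y))
    return rows


def fit(snapshots, dt):
    """Minimize sum (c_a*a + c_b*b - y)^2 by solving its 2x2 normal equations."""
    g11 = g12 = g22 = r1 = r2 = 0.0
    for xs_now, xs_next in zip(snapshots, snapshots[1:]):
        for c_a, c_b, y in weak_form_rows(xs_now, xs_next, dt):
            g11 += c_a * c_a
            g12 += c_a * c_b
            g22 += c_b * c_b
            r1 += c_a * y
            r2 += c_b * y
    det = g11 * g22 - g12 * g12
    return (r1 * g22 - r2 * g12) / det, (g11 * r2 - g12 * r1) / det


random.seed(0)
x0 = [0.5 + 2.0 * i / 49 for i in range(50)]
dt = 1e-3
snaps = [snapshot(x0, n * dt) for n in range(51)]
a_hat, b_hat = fit(snaps, dt)
print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
```

Because every quantity is an average over particles, shuffling each snapshot changes the estimates only at the level of floating-point rounding. The remaining error in this sketch comes from the forward-difference approximation in time and shrinks with dt.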

Abstract

Learning the potentials of interacting particle systems is a fundamental task across various scientific disciplines. A major challenge is that unlabeled data collected at discrete time points lack trajectory information due to limitations in data collection methods or privacy constraints. We address this challenge by introducing a trajectory-free self-test loss function that leverages the weak-form stochastic evolution equation of the empirical distribution. The loss function is quadratic in potentials, supporting parametric and nonparametric regression algorithms for robust estimation that scale to large, high-dimensional systems with big data. Systematic numerical tests show that our method outperforms baseline methods that regress on trajectories recovered via label matching, tolerating large observation time steps. We establish the convergence of parametric estimators as the sample size increases, providing a theoretical foundation for the proposed approach.
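
To make the label-matching baseline concrete, here is a minimal sketch of one plausible form of it: pair particles across two consecutive snapshots by minimizing the total squared displacement. The paper's actual baseline may differ. The brute-force search, the constant-velocity toy data, and the names `match_labels` and `positions` are all hypothetical; for more than a handful of particles an assignment solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice.

```python
from itertools import permutations


def match_labels(xs_prev, xs_next):
    """Return perm so that xs_next[perm[i]] is paired with xs_prev[i].

    Minimizes total squared displacement by brute force (tiny N only).
    """
    def cost(perm):
        return sum((xs_next[j] - xs_prev[i]) ** 2 for i, j in enumerate(perm))

    return min(permutations(range(len(xs_prev))), key=cost)


# Hypothetical data: particle i moves with constant velocity v[i].
x0 = [0.0, 1.0, 3.0]
v = [2.0, -1.0, 0.5]


def positions(t):
    # Kept in true particle order, so the correct matching is (0, 1, 2).
    return [x + vi * t for x, vi in zip(x0, v)]


small = match_labels(x0, positions(0.1))  # particles barely move
large = match_labels(x0, positions(1.0))  # particles 0 and 1 cross paths
print("small step:", small)
print("large step:", large)
```

In this toy case the small step recovers the true pairing, while the large step swaps particles 0 and 1 because they cross between observations. Any regression on the recovered trajectories would then inherit that error, which is the failure mode a trajectory-free loss avoids.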