Bridging the Simulation-to-Experiment Gap with Generative Models using Adversarial Distribution Alignment

arXiv cs.LG / 4/2/2026


Key Points

  • The paper addresses the simulation-to-experiment gap by proposing a distribution-alignment framework that links generative models trained on simulated data to experimental observations that only partially reveal the system state.
  • It introduces “Adversarial Distribution Alignment (ADA)” to align a generative model of atomic positions, initially trained on a simulated Boltzmann distribution, with the experimental observation distribution.
  • The authors prove that ADA can recover the target observable distribution even when multiple observables are present and may be correlated.
  • The approach is presented as domain-agnostic but is demonstrated in the physical sciences, with validation on synthetic, molecular, and experimental protein data showing that it can align generative models with diverse observables.
  • The authors make their code publicly available, which supports replication and potential adoption in simulation-to-experiment modeling workflows.

Abstract

A fundamental challenge in science and engineering is the simulation-to-experiment gap. While we often possess prior knowledge of physical laws, these physical laws can be too difficult to solve exactly for complex systems. Such systems are commonly modeled using simulators, which impose computational approximations. Meanwhile, experimental measurements more faithfully represent the real world, but experimental data typically consists of observations that only partially reflect the system's full underlying state. We propose a data-driven distribution alignment framework that bridges this simulation-to-experiment gap by pre-training a generative model on fully observed (but imperfect) simulation data, then aligning it with partial (but real) observations of experimental data. While our method is domain-agnostic, we ground our approach in the physical sciences by introducing Adversarial Distribution Alignment (ADA). This method aligns a generative model of atomic positions -- initially trained on a simulated Boltzmann distribution -- with the distribution of experimental observations. We prove that our method recovers the target observable distribution, even with multiple, potentially correlated observables. We also empirically validate our framework on synthetic, molecular, and experimental protein data, demonstrating that it can align generative models with diverse observables. Our code is available at https://kaityrusnelson.com/ada/.
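
To make the idea of aligning a pre-trained generator with partial observations concrete, here is a minimal toy sketch. It is not the authors' ADA implementation, whose details are not given in the abstract. Everything in it is an assumption chosen for illustration: a two-dimensional Gaussian "generator" whose mean stands in for a model pre-trained on simulation, a hypothetical `observable` function that exposes only the first coordinate, a logistic discriminator that sees observables only, and synthetic "experimental" observations drawn from a Gaussian with mean 1.0.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def observable(state):
    # Assumption: the experiment measures only the first coordinate of the state.
    return state[0]


def align(mu_sim, exp_obs, steps=500, batch=256, lr=0.1, decay=1.0, seed=0):
    """Toy adversarial alignment of a Gaussian generator's mean to observed data."""
    rng = random.Random(seed)
    mu = list(mu_sim)   # generator mean, initialised from the "simulation" prior
    w, b = 0.0, 0.0     # logistic discriminator on observables: D(y) = sigmoid(w*y + b)
    for _ in range(steps):
        real = [rng.choice(exp_obs) for _ in range(batch)]
        # Generator: full state x = mu + unit Gaussian noise; only h(x) is passed on.
        fake = [observable([m + rng.gauss(0.0, 1.0) for m in mu])
                for _ in range(batch)]
        d_real = [sigmoid(w * y + b) for y in real]
        d_fake = [sigmoid(w * y + b) for y in fake]
        # Gradients of the discriminator's cross-entropy loss, plus an L2 penalty.
        grad_w = (sum(d * y for d, y in zip(d_fake, fake))
                  - sum((1.0 - d) * y for d, y in zip(d_real, real))) / batch + decay * w
        grad_b = (sum(d_fake) - sum(1.0 - d for d in d_real)) / batch + decay * b
        # Gradient of the non-saturating generator loss -log D(h(x)) w.r.t. mu[0].
        grad_mu0 = -w * sum(1.0 - d for d in d_fake) / batch
        w -= lr * grad_w
        b -= lr * grad_b
        mu[0] -= lr * grad_mu0
    return mu


# Synthetic stand-in for experimental observations (assumed: mean 1.0, unit variance).
data_rng = random.Random(1)
exp_obs = [data_rng.gauss(1.0, 1.0) for _ in range(2000)]

# The "simulation" prior places the mean at the origin; alignment should move mu[0].
mu_aligned = align([0.0, 0.0], exp_obs)
print(mu_aligned)
```

In this sketch the discriminator never sees the full state, so the only training signal reaching the generator comes through the observable. The unobserved coordinate `mu[1]` keeps its simulation value because it is a separate parameter here; in a realistic generator with shared parameters, unobserved degrees of freedom would generally shift as well. The L2 penalty on the discriminator is a design choice of this example: with a linear discriminator, unregularized adversarial updates tend to oscillate around the solution rather than settle.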