Align Your Structures: Generating Trajectories with Structure Pretraining for Molecular Dynamics

arXiv cs.LG / 4/7/2026


Key Points

  • The paper proposes a framework for generating molecular dynamics (MD) trajectories with deep generative models by using structure pretraining to address limited MD trajectory data and the complexity of high-dimensional MD distributions.
  • It trains a diffusion-based structure generation model on large-scale conformer datasets and adds an interpolator module trained on MD trajectory data to enforce temporal consistency across generated structures.
  • The method decomposes MD trajectory generation into two more manageable subproblems, structural generation and temporal alignment: abundant structural data is used to learn what valid structures look like, while the scarcer MD data is used only to learn temporal constraints.
  • Experiments on the QM9 and DRUGS small-molecule datasets cover three tasks (unconditional generation, forward simulation, and interpolation) and report improved accuracy on geometric, dynamical, and energetic metrics.
  • The framework is further extended to tetrapeptide and protein monomer systems, indicating broader applicability beyond small molecules.
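
The two-stage decomposition described in the points above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `structure_denoise_step` is a hypothetical stand-in for a pretrained per-frame structure model (here it just pulls coordinates toward a reference conformer), and `temporal_interpolator` is a hypothetical stand-in for a temporal module (here a simple neighbour average). Neither is the paper's architecture; the sketch only shows how a per-frame step and a cross-frame step can alternate.

```python
import numpy as np

def structure_denoise_step(frames, reference, step=0.5):
    """Toy stand-in for a per-frame structure model (an assumption).

    Each frame is handled independently: its coordinates move a fraction
    `step` of the way toward a reference conformer.
    """
    return frames + step * (reference[None] - frames)

def temporal_interpolator(frames, weight=0.5):
    """Toy stand-in for a temporal module (an assumption).

    Each interior frame is blended with the mean of its two neighbours;
    the first and last frames are left unchanged.
    """
    out = frames.copy()
    neighbour_mean = 0.5 * (frames[:-2] + frames[2:])
    out[1:-1] = (1.0 - weight) * frames[1:-1] + weight * neighbour_mean
    return out

def generate_trajectory(n_frames, reference, n_steps=10, seed=0):
    """Alternate the per-frame step and the cross-frame step, starting from noise."""
    rng = np.random.default_rng(seed)
    frames = rng.normal(size=(n_frames,) + reference.shape)
    for _ in range(n_steps):
        frames = structure_denoise_step(frames, reference)  # structure quality
        frames = temporal_interpolator(frames)               # temporal consistency
    return frames

def mean_frame_jump(frames):
    """Average per-atom displacement between consecutive frames."""
    return float(np.linalg.norm(np.diff(frames, axis=0), axis=-1).mean())
```

For a reference array of shape `(n_atoms, 3)`, `generate_trajectory(16, reference)` returns an array of shape `(16, n_atoms, 3)`. Comparing `mean_frame_jump` before and after `temporal_interpolator` on random frames shows the smoothing effect of the cross-frame step.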

Abstract

Generating molecular dynamics (MD) trajectories using deep generative models has attracted increasing attention, yet remains inherently challenging due to the limited availability of MD data and the complexities involved in modeling high-dimensional MD distributions. To overcome these challenges, we propose a novel framework that leverages structure pretraining for MD trajectory generation. Specifically, we first train a diffusion-based structure generation model on a large-scale conformer dataset, on top of which we introduce an interpolator module trained on MD trajectory data, designed to enforce temporal consistency among generated structures. Our approach harnesses abundant structural data to mitigate the scarcity of MD trajectory data and decomposes the intricate MD modeling task into two manageable subproblems: structural generation and temporal alignment. We comprehensively evaluate our method on the QM9 and DRUGS small-molecule datasets across unconditional generation, forward simulation, and interpolation tasks, and further extend our framework and analysis to tetrapeptide and protein monomer systems. Experimental results confirm that our approach excels in generating chemically realistic MD trajectories, as evidenced by remarkable improvements in accuracy across geometric, dynamical, and energetic measurements.
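
One common way to read the three evaluation tasks is by which frames of a trajectory are given and which must be generated. The sketch below encodes that reading as a boolean mask over frames. It is an assumption for illustration only: the function names `conditioning_mask` and `apply_conditioning` are hypothetical, and the abstract does not describe how the paper actually implements conditioning.

```python
import numpy as np

def conditioning_mask(task, n_frames):
    """Return a boolean mask over frames: True = given, False = to be generated.

    Assumed reading of the tasks:
      - unconditional:      no frame is given
      - forward_simulation: only the first frame is given
      - interpolation:      the first and last frames are given
    """
    mask = np.zeros(n_frames, dtype=bool)
    if task == "unconditional":
        pass
    elif task == "forward_simulation":
        mask[0] = True
    elif task == "interpolation":
        mask[0] = True
        mask[-1] = True
    else:
        raise ValueError(f"unknown task: {task!r}")
    return mask

def apply_conditioning(sample, known, mask):
    """Overwrite the given frames of a generated sample with their known values."""
    out = sample.copy()
    out[mask] = known[mask]
    return out
```

Here `sample` and `known` are arrays of shape `(n_frames, n_atoms, 3)`. Under this reading, the same generator can serve all three tasks, with only the mask changing between them.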