Improving Molecular Force Fields with Minimal Temporal Information

arXiv cs.LG / 4/23/2026

Opinion · Models & Research

Key Points

The paper addresses a core AI-for-Science challenge: accurately predicting molecular energies and forces using neural networks trained on atomic configurations.
It argues that most models ignore how their training data are generated: molecular dynamics (MD) simulations produce time-ordered trajectories whose fluctuations explore the potential energy surface.
The authors propose FRAMES, a new training strategy that adds an auxiliary loss to exploit temporal relationships within MD trajectories.
Experiments show that using only minimal temporal context—pairs of two consecutive frames—can yield the best performance, and longer sequences may add redundancy and even reduce accuracy.
On the MD17 and ISO17 benchmarks, FRAMES significantly outperforms its Equiformer baseline and achieves highly competitive energy and force accuracy, suggesting that more temporal data is not always beneficial for learning physical priors.
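
To make "minimal temporal context" concrete, here is a minimal sketch of how an MD trajectory might be cut into windows of time-adjacent frames, where a window of 2 gives the consecutive-frame pairs described above. The function name `consecutive_windows` and the array layout `(n_frames, n_atoms, 3)` are assumptions made for illustration; the summary does not describe the authors' data pipeline.

```python
import numpy as np


def consecutive_windows(trajectory, window=2):
    """Group a trajectory into overlapping windows of time-adjacent frames.

    trajectory: array of shape (n_frames, n_atoms, 3), ordered in time
                (hypothetical layout, not taken from the paper).
    window:     number of consecutive frames per sample; 2 yields pairs.

    Returns an array of shape (n_frames - window + 1, window, n_atoms, 3).
    """
    trajectory = np.asarray(trajectory)
    n_frames = trajectory.shape[0]
    if window < 1 or window > n_frames:
        raise ValueError("window must be between 1 and the number of frames")

    # Row i of `idx` holds the frame indices i, i+1, ..., i+window-1.
    starts = np.arange(n_frames - window + 1)
    idx = starts[:, None] + np.arange(window)[None, :]
    return trajectory[idx]


# Toy trajectory: 5 frames of a 2-atom system.
toy_trajectory = np.arange(5 * 2 * 3, dtype=float).reshape(5, 2, 3)
frame_pairs = consecutive_windows(toy_trajectory, window=2)
```

Raising `window` gives the longer sequences that, according to the key points, can add redundancy rather than useful signal.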

Abstract

Accurate prediction of energy and forces for 3D molecular systems is one of the fundamental challenges at the core of AI for Science applications. Many powerful and data-efficient neural networks predict molecular energies and forces from single atomic configurations. However, one crucial aspect of the data generation process is rarely considered when training these models: Molecular Dynamics (MD) simulation. MD simulations generate time-ordered trajectories of atomic positions that fluctuate in energy and explore regions of the potential energy surface (e.g., under standard NVE/NVT ensembles), rather than being constructed to steadily lower the potential energy toward a minimum as in geometry relaxations. This work explores a novel way to leverage MD data, when available, to improve the performance of such predictors. We introduce a training strategy called FRAMES that uses an auxiliary loss function to exploit the temporal relationships within MD trajectories. Counter-intuitively, on two atomistic benchmarks and a synthetic system, we observe that minimal temporal information, captured by pairs of just two consecutive frames, is often sufficient to obtain the best performance, while adding longer trajectory sequences can introduce redundancy and degrade performance. On the widely used MD17 and ISO17 benchmarks, FRAMES significantly outperforms its Equiformer baseline, achieving highly competitive results in both energy and force accuracy. Our work not only presents a training strategy that improves model accuracy, but also provides evidence that, for distilling physical priors of atomic systems, more temporal data is not always better.
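
As a rough illustration of what "an auxiliary loss" means in practice, the sketch below adds a weighted extra term to a standard energy-plus-force objective. The abstract does not state the form of the FRAMES auxiliary loss or any weighting, so `aux_loss` is passed in as an opaque number, and `force_weight`, `aux_weight`, and the use of mean absolute error are all assumptions for illustration only.

```python
import numpy as np


def total_loss(e_pred, e_true, f_pred, f_true, aux_loss,
               force_weight=1.0, aux_weight=0.1):
    """Combine energy, force, and auxiliary terms into one scalar objective.

    aux_loss is a placeholder for a temporal term computed elsewhere from
    consecutive frames; its actual definition is not given in the source.
    The weights are hypothetical defaults, not values from the paper.
    """
    # Mean absolute error on per-molecule energies.
    energy_term = np.mean(np.abs(np.asarray(e_pred) - np.asarray(e_true)))
    # Mean absolute error over all force components.
    force_term = np.mean(np.abs(np.asarray(f_pred) - np.asarray(f_true)))
    return energy_term + force_weight * force_term + aux_weight * aux_loss


# Toy values: two molecules with three atoms each and perfect forces.
example_loss = total_loss(
    e_pred=np.array([1.0, 2.0]),
    e_true=np.array([1.5, 2.5]),
    f_pred=np.zeros((2, 3, 3)),
    f_true=np.zeros((2, 3, 3)),
    aux_loss=2.0,
)
```

Because the auxiliary term only enters the training objective, a model trained this way can still predict from a single configuration at inference time, provided the auxiliary term is dropped at that point; whether FRAMES works exactly this way is not confirmed by the abstract.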