Generative structure search for efficient and diverse discovery of molecular and crystal structures

arXiv cs.AI / 5/1/2026


Key Points

  • The paper addresses the prediction of stable and metastable molecular and crystal structures, a task limited by the cost of searching high-dimensional energy landscapes.
  • It proposes Generative Structure Search (GSS), a unified framework that treats diffusion-based generation and Random Structure Search (RSS) as limiting regimes of a common sampling process driven by learned score fields and physical forces.
  • By coupling learned data priors with energy-guided exploration of local minima, GSS aims to both accelerate sampling and improve coverage of rare but physically relevant minima.
  • Experiments on molecular and crystalline systems show that GSS recovers diverse metastable structures at less than one tenth of the sampling cost that RSS needs for broad coverage, and that it remains effective for compositions outside the training distribution.
  • Overall, the work presents a physically grounded generative search strategy that can discover structures beyond what data-driven sampling alone can reach.
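
One way to picture a "common sampling process" with two limiting regimes is a single Langevin-style update whose drift mixes a learned score with a physical force. The form below is an illustrative assumption, not the equation from the paper: the mixing weight \lambda, step size \eta, temperature T, and the linear blend itself are hypothetical choices, and the force is taken to be in units commensurate with the score.

```latex
% Hypothetical blended update (an assumption for illustration, not the paper's formulation)
x_{k+1} = x_k
  + \eta \left[ (1-\lambda)\, s_\theta(x_k) + \lambda\, F(x_k) \right]
  + \sqrt{2 \eta T}\; \xi_k ,
\qquad
F(x) = -\nabla E(x), \quad \xi_k \sim \mathcal{N}(0, I)

% \lambda = 0 : drift comes only from the learned score s_\theta (data-driven sampling)
% \lambda = 1,\ T = 0 : plain gradient descent on E from the chosen starting points,
%                       i.e. the relaxation step of a random structure search
```

Under this reading, intermediate values of \lambda let the learned prior steer samples while the force term keeps them anchored to minima of the energy E.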

Abstract

Predicting stable and metastable structures is central to molecular and materials discovery, but remains limited by the cost of searching high-dimensional energy landscapes. Deep generative models offer efficient structure sampling, yet their outputs remain shaped by training data and can underexplore minima that are rare but physically relevant. We introduce generative structure search (GSS), a unified framework that formulates diffusion-based generation and random structure search (RSS) as limiting regimes of a common sampling process driven by learned score fields and physical forces. Coupling these drivers lets GSS use data priors to accelerate sampling while retaining energy-guided exploration of local minima. Across molecular and crystalline systems, GSS recovers diverse metastable structures with more than tenfold lower sampling cost than RSS for broad coverage and remains effective for compositions outside the training distribution. The results establish a physically grounded generative search strategy for discovering structures beyond the reach of data-driven sampling alone.
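
The idea of two drivers sharing one sampler can be made concrete on a toy problem. The sketch below is not the paper's algorithm: it assumes a one-dimensional double-well energy, a Gaussian centered on one well as a stand-in for a learned score, and a hypothetical mixing weight `lam` that linearly blends the two drifts. All function names and parameters are invented for illustration.

```python
import numpy as np

def energy_force(x):
    # Toy double-well energy E(x) = (x^2 - 1)^2 with minima at x = -1 and x = +1.
    # The physical force is F = -dE/dx.
    return -4.0 * x * (x**2 - 1.0)

def prior_score(x, mu=1.0, sigma=2.0):
    # Stand-in for a learned score field: the score of a Gaussian N(mu, sigma^2),
    # mimicking training data that only covers the well at x = +1.
    return -(x - mu) / sigma**2

def sample(x0, lam, steps=5000, eta=0.01, temperature=0.0, seed=0):
    # Langevin-style update with a blended drift (hypothetical form):
    #   lam = 0 -> score-driven only, lam = 1 -> force-driven only.
    # With temperature = 0 the noise term vanishes and the run is deterministic.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        drift = (1.0 - lam) * prior_score(x) + lam * energy_force(x)
        noise = np.sqrt(2.0 * eta * temperature) * rng.standard_normal(x.shape)
        x = x + eta * drift + noise
    return x

def count_minima(x, decimals=1):
    # Count distinct end points after rounding, as a crude diversity measure.
    return len(np.unique(np.round(x, decimals)))

if __name__ == "__main__":
    starts = np.linspace(-2.0, 2.0, 8)  # spread of starting points
    for lam in (0.0, 0.5, 1.0):
        finals = sample(starts, lam)
        print(f"lam={lam:.1f}: {count_minima(finals)} distinct end point(s)")
```

In this toy setting the score-only run pulls every start toward the prior's single mode, while runs that include the force term keep both wells reachable. Real structure search works in far higher dimensions with learned, time-dependent scores, so this is only a cartoon of the trade-off the abstract describes.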