From Data to Theory: Autonomous Large Language Model Agents for Materials Science

arXiv cs.AI / April 23, 2026


Key Points

  • Researchers propose an autonomous LLM agent that can perform end-to-end, data-driven materials theory development, including selecting equation forms, writing/running code, and validating fits to data without human intervention.
  • The framework combines step-by-step reasoning with expert-provided tools while maintaining a transparent decision log, enabling iterative adjustment of the agent’s approach.
  • On established materials relationships like the Hall-Petch equation and Paris law, the agent reliably identifies governing equations and makes predictions on new datasets.
  • For more specialized relationships (e.g., Kuhn's equation for the HOMO-LUMO gap), results depend more strongly on the base model: GPT-5 recovers the correct equation more reliably. Even so, the agent can still produce incorrect or inconsistent equations despite seemingly strong numerical fits.
  • The agent can also propose new predictive relationships (such as a strain-dependent law for the HOMO-LUMO gap), but the study emphasizes the continued need for careful validation to ensure scientific correctness.

Abstract

We present an autonomous large language model (LLM) agent for end-to-end, data-driven materials theory development. The model can choose an equation form, generate and run its own code, and test how well the theory matches the data without human intervention. The framework combines step-by-step reasoning with expert-supplied tools, allowing the agent to adjust its approach as needed while keeping a clear record of its decisions. For well-established materials relationships such as the Hall-Petch equation and Paris law, the agent correctly identifies the governing equation and makes reliable predictions on new datasets. For more specialized relationships, such as Kuhn's equation for the HOMO-LUMO gap of conjugated molecules as a function of length, performance depends more strongly on the underlying model, with GPT-5 showing better recovery of the correct equation. Beyond known theories, the agent can also suggest new predictive relationships, illustrated here by a strain-dependent law for changes in the HOMO-LUMO gap. At the same time, the results show that careful validation remains essential, because the agent can still return incorrect, incomplete, or inconsistent equations even when the numerical fit appears strong. Overall, these results highlight both the promise and the current limitations of autonomous LLM agents for AI-assisted scientific modeling and discovery.
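To make the equation-fitting step concrete: for the Hall-Petch relation, the agent must select the functional form σ_y = σ_0 + k/√d and fit it to grain-size/strength data. The sketch below is not the paper's code; it is a minimal illustration of that one step, using synthetic data with made-up parameter values and a standard least-squares fit.

```python
# Illustrative sketch (not the paper's implementation): fitting the
# Hall-Petch form  sigma_y = sigma_0 + k / sqrt(d)  to synthetic data,
# the kind of candidate-equation validation the agent automates.
import numpy as np
from scipy.optimize import curve_fit

def hall_petch(d, sigma_0, k):
    """Yield strength (MPa) as a function of grain size d (Hall-Petch form)."""
    return sigma_0 + k / np.sqrt(d)

# Synthetic "measurements": grain sizes and strengths generated from
# known parameters plus noise (all numbers here are hypothetical).
rng = np.random.default_rng(0)
d = np.linspace(1.0, 100.0, 40)
true_sigma_0, true_k = 150.0, 600.0
sigma = hall_petch(d, true_sigma_0, true_k) + rng.normal(0.0, 5.0, d.size)

# Fit the candidate equation form and recover its parameters.
params, _ = curve_fit(hall_petch, d, sigma)
sigma_0_fit, k_fit = params
print(f"sigma_0 ~ {sigma_0_fit:.1f} MPa, k ~ {k_fit:.1f}")
```

In the paper's workflow the agent goes further: it proposes the form itself, writes and runs code like this on its own, and judges the fit quality, which is exactly where the authors warn that a strong numerical fit does not guarantee the recovered equation is the physically correct one.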