LLMPhy: Parameter-Identifiable Physical Reasoning Combining Large Language Models and Physics Engines

arXiv cs.RO / 4/27/2026


Key Points

  • LLMPhy is an optimization framework that combines large language models with physics engines to perform physical reasoning while explicitly addressing the key challenge of parameter identification (e.g., mass and friction).
  • The method builds digital twins by splitting the task into two parts: continuous physical-parameter estimation and discrete scene-layout estimation, both refined through iterative LLM-generated program execution and physics-simulation feedback.
  • LLMPhy uses reconstruction error from the physics engine as a learning signal to improve latent parameter estimates, effectively bridging “textbook” physical knowledge in LLMs with realistic world models in simulators.
  • The paper introduces three new zero-shot datasets focused on parameter identifiability, since existing benchmarks often do not evaluate this aspect.
  • Experiments report that LLMPhy achieves state-of-the-art performance, recovers physical parameters more accurately, and converges more reliably than prior black-box approaches.

Abstract

Most learning-based approaches to complex physical reasoning sidestep the crucial problem of parameter identification (e.g., mass, friction) that governs scene dynamics, despite its importance in real-world applications such as collision avoidance and robotic manipulation. In this paper, we present LLMPhy, a black-box optimization framework that integrates large language models (LLMs) with physics simulators for physical reasoning. The core insight of LLMPhy is to bridge the textbook physical knowledge embedded in LLMs with the world models implemented in modern physics engines, enabling the construction of digital twins of input scenes via latent parameter estimation. Specifically, LLMPhy decomposes digital twin construction into two subproblems: (i) a continuous problem of estimating physical parameters and (ii) a discrete problem of estimating scene layout. For each subproblem, LLMPhy iteratively prompts the LLM to generate computer programs encoding parameter estimates, executes them in the physics engine to reconstruct the scene, and uses the resulting reconstruction error as feedback to refine the LLM's predictions. As existing physical reasoning benchmarks rarely account for parameter identifiability, we introduce three new datasets designed to evaluate physical reasoning in zero-shot settings. Our results show that LLMPhy achieves state-of-the-art performance on our tasks, recovers physical parameters more accurately, and converges more reliably than prior black-box methods. See the LLMPhy project page for details: https://www.merl.com/research/highlights/LLMPhy
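The iterative loop described above (LLM proposes parameter programs, the physics engine reconstructs the scene, and the reconstruction error feeds back into the next proposal) can be sketched as follows. This is a toy illustration, not the authors' implementation: `propose_parameters` is a hypothetical stand-in for the LLM (here a simple bisection over a friction range guided by feedback) and `simulate` is a hypothetical stand-in for the physics engine (a one-parameter sliding-block model).

```python
def simulate(params):
    """Toy stand-in for the physics engine: distance a pushed block
    slides, shrinking as the latent friction parameter grows."""
    return 10.0 / (1.0 + params["friction"])

def propose_parameters(history):
    """Toy stand-in for the LLM: use past (proposal, feedback) pairs
    to narrow the plausible friction range, mimicking how LLMPhy
    conditions each new program on simulation feedback."""
    lo, hi = 0.0, 2.0
    for params, error, observed, simulated in history:
        if simulated > observed:
            lo = params["friction"]  # block slid too far: friction too low
        else:
            hi = params["friction"]  # block stopped short: friction too high
    return {"friction": (lo + hi) / 2.0}

def llmphy_loop(observed, iterations=30):
    """Black-box optimization loop: propose, simulate, score, refine."""
    history = []
    for _ in range(iterations):
        params = propose_parameters(history)
        simulated = simulate(params)
        error = abs(simulated - observed)  # reconstruction error as signal
        history.append((params, error, observed, simulated))
    # Return the best parameter estimate found.
    return min(history, key=lambda h: h[1])[0]

# Recover a hidden friction coefficient from an observed outcome.
true_friction = 0.7
best = llmphy_loop(simulate({"friction": true_friction}))
```

In the actual framework, the LLM proposal step emits an executable program encoding the parameter estimates, and the same feedback loop is run separately for the continuous (physical parameters) and discrete (scene layout) subproblems.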
