LLMPhy: Parameter-Identifiable Physical Reasoning Combining Large Language Models and Physics Engines

arXiv cs.RO / 4/27/2026


Key Points

  • LLMPhy is an optimization framework that combines large language models with physics engines to perform physical reasoning while explicitly addressing the key challenge of parameter identification (e.g., mass and friction).
  • The method builds digital twins by splitting the task into two parts: continuous physical-parameter estimation and discrete scene-layout estimation, both refined through iterative LLM-generated program execution and physics-simulation feedback.
  • LLMPhy uses reconstruction error from the physics engine as a learning signal to improve latent parameter estimates, effectively bridging “textbook” physical knowledge in LLMs with realistic world models in simulators.
  • The paper introduces three new zero-shot datasets focused on parameter identifiability, since existing benchmarks often do not evaluate this aspect.
  • Experiments report that LLMPhy achieves state-of-the-art performance, recovers physical parameters more accurately, and converges more reliably than prior black-box approaches.

Abstract

Most learning-based approaches to complex physical reasoning sidestep the crucial problem of parameter identification (e.g., mass, friction) that governs scene dynamics, despite its importance in real-world applications such as collision avoidance and robotic manipulation. In this paper, we present LLMPhy, a black-box optimization framework that integrates large language models (LLMs) with physics simulators for physical reasoning. The core insight of LLMPhy is to bridge the textbook physical knowledge embedded in LLMs with the world models implemented in modern physics engines, enabling the construction of digital twins of input scenes via latent parameter estimation. Specifically, LLMPhy decomposes digital twin construction into two subproblems: (i) a continuous problem of estimating physical parameters and (ii) a discrete problem of estimating scene layout. For each subproblem, LLMPhy iteratively prompts the LLM to generate computer programs encoding parameter estimates, executes them in the physics engine to reconstruct the scene, and uses the resulting reconstruction error as feedback to refine the LLM's predictions. As existing physical reasoning benchmarks rarely account for parameter identifiability, we introduce three new datasets designed to evaluate physical reasoning in zero-shot settings. Our results show that LLMPhy achieves state-of-the-art performance on our tasks, recovers physical parameters more accurately, and converges more reliably than prior black-box methods. See the LLMPhy project page for details: https://www.merl.com/research/highlights/LLMPhy
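The iterative loop described above (LLM proposes parameter programs, the physics engine reconstructs the scene, and the reconstruction error feeds back into the next proposal) can be sketched as follows. This is a toy illustration, not the authors' implementation: `propose_parameters` is a hypothetical stand-in for the LLM (here a simple bisection over a friction range guided by feedback) and `simulate` is a hypothetical stand-in for the physics engine (a one-parameter sliding-block model).

```python
def simulate(params):
    """Toy stand-in for the physics engine: distance a pushed block
    slides, shrinking as the latent friction parameter grows."""
    return 10.0 / (1.0 + params["friction"])

def propose_parameters(history):
    """Toy stand-in for the LLM: use past (proposal, feedback) pairs
    to narrow the plausible friction range, mimicking how LLMPhy
    conditions each new program on simulation feedback."""
    lo, hi = 0.0, 2.0
    for params, error, observed, simulated in history:
        if simulated > observed:
            lo = params["friction"]  # block slid too far: friction too low
        else:
            hi = params["friction"]  # block stopped short: friction too high
    return {"friction": (lo + hi) / 2.0}

def llmphy_loop(observed, iterations=30):
    """Black-box optimization loop: propose, simulate, score, refine."""
    history = []
    for _ in range(iterations):
        params = propose_parameters(history)
        simulated = simulate(params)
        error = abs(simulated - observed)  # reconstruction error as signal
        history.append((params, error, observed, simulated))
    # Return the best parameter estimate found.
    return min(history, key=lambda h: h[1])[0]

# Recover a hidden friction coefficient from an observed outcome.
true_friction = 0.7
best = llmphy_loop(simulate({"friction": true_friction}))
```

In the actual framework, the LLM proposal step emits an executable program encoding the parameter estimates, and the same feedback loop is run separately for the continuous (physical parameters) and discrete (scene layout) subproblems.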
