Can Large Language Models Reason and Optimize Under Constraints?

arXiv cs.AI / March 25, 2026


Key Points

  • The paper evaluates whether large language models can perform reasoning and constrained optimization on the Optimal Power Flow (OPF) problem, which has real physical/operational limits.
  • It proposes a rigorous benchmark that tests multiple core skills needed for constraint solving, including structured input handling, arithmetic, reasoning, and constrained optimization.
  • Results show that state-of-the-art LLMs fail on most tasks, and even reasoning-focused LLMs struggle significantly in the hardest constraint-heavy settings.
  • The authors identify key gaps in LLMs’ ability to execute structured reasoning under constraints and frame the benchmark as a testing ground for future LLM assistants aimed at real power-grid optimization.

Abstract

Large Language Models (LLMs) have demonstrated strong capabilities across diverse natural language tasks, yet their ability to solve abstraction and optimization problems with constraints remains largely unexplored. In this paper, we investigate whether LLMs can reason and optimize under the physical and operational constraints of the Optimal Power Flow (OPF) problem. We introduce a challenging evaluation setup that requires a set of fundamental skills such as reasoning, structured input handling, arithmetic, and constrained optimization. Our evaluation reveals that SoTA LLMs fail on most of the tasks, and that reasoning LLMs still fail in the most complex settings. Our findings highlight critical gaps in LLMs' ability to handle structured reasoning under constraints, and this work provides a rigorous testing environment for developing more capable LLM assistants that can tackle real-world power-grid optimization problems.
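To make the constraint structure concrete, here is a minimal sketch of the kind of problem OPF generalizes: single-bus economic dispatch, where generation must exactly balance demand while respecting each generator's capacity limit. This toy example is illustrative only; the numbers, function names, and the merit-order method are assumptions for exposition, not from the paper (real OPF adds network flow limits, voltage bounds, and nonlinear AC physics).

```python
# Toy single-bus economic dispatch: the simplest constrained core of OPF.
# Constraints: 0 <= p[i] <= p_max[i] for each generator, and sum(p) == demand.
# All numeric values here are hypothetical, chosen for illustration.

def dispatch(costs, p_max, demand):
    """Merit-order dispatch: fill the cheapest generators first, subject to
    capacity limits and the power-balance constraint sum(p) == demand.
    For a single bus with linear costs, this greedy order is optimal."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    p = [0.0] * len(costs)
    remaining = demand
    for i in order:
        take = min(p_max[i], remaining)  # respect the capacity constraint
        p[i] = take
        remaining -= take
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity: problem is infeasible")
    return p

# Two generators: a cheap 100 MW unit and an expensive 120 MW unit, 150 MW load.
plan = dispatch(costs=[20.0, 50.0], p_max=[100.0, 120.0], demand=150.0)
print(plan)  # cheapest unit runs at its 100 MW limit; unit 2 covers the rest
total_cost = sum(c * g for c, g in zip([20.0, 50.0], plan))
print(total_cost)
```

Even this trivial instance requires the skill mix the benchmark targets: parsing structured input (costs, limits, demand), exact arithmetic, and reasoning about which constraints bind. The paper's finding is that LLMs struggle as these constraints compound in realistic grid settings.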