An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress

arXiv cs.AI / 4/28/2026


Key Points

  • The paper argues that LLM reliability in high-stakes deployments cannot be fully captured by aggregate accuracy and proposes a new evaluation approach based on thermodynamic/information-geometric ideas.
  • It introduces a composite “stability score” that combines task utility, entropy (external uncertainty), and two internal structural proxies (internal integration and aligned reflective capacity) to model how disorder affects behavior.
  • Using the IST-20 benchmarking protocol and metadata, the authors analyze 80 model-scenario observations across four contemporary LLMs and find that the full formulation yields higher stability scores than a reduced utility–entropy baseline.
  • The reported average improvement is 0.0299 (95% CI: 0.0247–0.0351), and the benefit is larger under higher-entropy conditions, indicating a nonlinear attenuation of uncertainty that strengthens as disorder increases.
  • The work is positioned as an interpretable abstraction for connecting uncertainty, performance, and internal structure, intended to complement existing safety, reliability, and governance discussions rather than claim physical law or a complete theory of ethics.
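The paper does not publish the exact functional form of its stability score, but the key points above imply a structure in which the two internal proxies attenuate the entropy penalty on utility. The sketch below is purely illustrative under that assumption: the function name, arguments, and the multiplicative attenuation term are all hypothetical, chosen only to reproduce the qualitative behavior reported (the full formulation beats the utility–entropy baseline, and by more at higher entropy).

```python
def stability_score(utility: float, entropy: float,
                    integration: float = 0.0,
                    reflective: float = 0.0) -> float:
    """Hypothetical composite stability score.

    Assumed form: the structural proxies (internal integration and
    aligned reflective capacity) damp the entropy penalty. With both
    proxies at zero this reduces to the utility-entropy baseline.
    """
    attenuation = 1.0 + integration + reflective  # assumed damping term
    return utility - entropy / attenuation


# Reduced baseline: structural proxies absent (set to zero).
baseline = stability_score(0.8, 0.4)
full = stability_score(0.8, 0.4, integration=0.3, reflective=0.2)
assert full > baseline  # structure attenuates the entropy penalty

# The gain over the baseline grows with entropy, mirroring the
# reported pattern of larger improvements under higher-entropy conditions.
gain_low = stability_score(0.8, 0.4, 0.3, 0.2) - stability_score(0.8, 0.4)
gain_high = stability_score(0.8, 0.8, 0.3, 0.2) - stability_score(0.8, 0.8)
assert gain_high > gain_low
```

Any formulation with this attenuating shape would show the same qualitative ordering; the specific constants here carry no meaning beyond the demonstration.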

Abstract

As large language models (LLMs) are increasingly deployed in high-stakes and operational settings, evaluation strategies based solely on aggregate accuracy are often insufficient to characterize system reliability. This study proposes a thermodynamics-inspired modeling framework for analyzing the stability of LLM outputs under conditions of uncertainty and perturbation. The framework introduces a composite stability score that integrates task utility, entropy as a measure of external uncertainty, and two internal structural proxies: internal integration and aligned reflective capacity. Rather than interpreting these quantities as physical variables, the formulation is intended as an interpretable abstraction that captures how internal structure may modulate the impact of disorder on model behavior. Using the IST-20 benchmarking protocol and associated metadata, we analyze 80 model-scenario observations across four contemporary LLMs. The proposed formulation consistently yields higher stability scores than a reduced utility-entropy baseline, with a mean improvement of 0.0299 (95% CI: 0.0247–0.0351). The observed gain is more pronounced under higher entropy conditions, suggesting that the framework captures a form of nonlinear attenuation of uncertainty. We do not claim a fundamental physical law or a complete theory of machine ethics. Instead, the contribution of this work is a compact and interpretable modeling perspective that connects uncertainty, performance, and internal structure within a unified evaluation lens. The framework is intended to complement existing benchmarking approaches and to support ongoing discussions in AI safety, reliability, and governance.
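The abstract's headline statistic, a mean improvement of 0.0299 with a 95% CI of 0.0247–0.0351 over 80 observations, is a standard interval estimate for a mean. The paper's per-observation deltas are not public, so the sketch below draws synthetic stand-in values solely to demonstrate one common way such an interval can be computed (a percentile bootstrap); the data, seed, and distribution are all illustrative, not the authors' results.

```python
import random
import statistics

random.seed(0)

# Illustrative stand-in for the 80 per-observation improvements
# (full formulation minus utility-entropy baseline). These values
# are synthetic, loosely centered near the reported mean of 0.0299.
deltas = [random.gauss(0.03, 0.012) for _ in range(80)]


def bootstrap_ci(data, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap confidence interval for the mean."""
    boot_means = sorted(
        statistics.fmean(random.choices(data, k=len(data)))
        for _ in range(n_boot)
    )
    lo = boot_means[int(alpha / 2 * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(data), (lo, hi)


mean, (lo, hi) = bootstrap_ci(deltas)
print(f"mean improvement ≈ {mean:.4f}, 95% CI ≈ ({lo:.4f}, {hi:.4f})")
```

A normal-approximation interval (mean ± 1.96 × standard error) would give similar numbers here; the bootstrap variant is shown because it makes no distributional assumption about the deltas.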