PLDR-LLMs Reason At Self-Organized Criticality

arXiv cs.LG / 3/26/2026


Key Points

  • The paper argues that PLDR-LLMs pretrained under self-organized criticality can perform reasoning during inference, with deductive outputs showing behavior analogous to second-order phase transitions.
  • It claims that at criticality the correlation length effectively diverges, and deductive outputs reach a metastable steady state that supports generalization and reasoning.
  • The authors propose that this steady-state behavior corresponds to learning representations akin to scaling functions, universality classes, and renormalization-group concepts from the training data.
  • They introduce an “order parameter” derived from global statistics of the model’s deductive-output parameters at inference and report that reasoning is strongest when the order parameter is near zero at criticality.
  • The study concludes that reasoning capability can be quantified using global model parameter values at steady state, without relying on curated benchmark evaluations for inductive measures of reasoning and comprehension.
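The paper does not spell out its order-parameter statistic in this summary, but the idea of "quantifying reasoning from global parameter statistics" can be illustrated with a toy sketch. Here `order_parameter` is a hypothetical stand-in (a plain global mean of deductive-output parameter values), and the two parameter lists are invented: one centered near zero (standing in for a near-critical model) and one biased away from zero (sub-critical).

```python
import statistics

def order_parameter(deductive_params):
    """Toy 'order parameter': the global mean of a model's
    deductive-output parameter values. Hypothetical stand-in for the
    paper's statistic, which is not specified in this summary."""
    return statistics.fmean(deductive_params)

# Hypothetical parameter samples from two models:
# near-critical (centered on zero) vs. sub-critical (biased away from zero).
near_critical = [-0.02, 0.01, -0.01, 0.03, -0.01]
sub_critical = [0.40, 0.35, 0.50, 0.45, 0.38]

# Per the paper's claim, reasoning should be stronger for the model
# whose order parameter is closer to zero.
print(abs(order_parameter(near_critical)) < abs(order_parameter(sub_critical)))
# → True
```

The point of the sketch is only the shape of the procedure: a single global statistic of inference-time parameters, compared against zero, replaces benchmark-based evaluation.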

Abstract

We show that PLDR-LLMs pretrained at self-organized criticality exhibit reasoning at inference time. The characteristics of PLDR-LLM deductive outputs at criticality are similar to those of second-order phase transitions. At criticality, the correlation length diverges and the deductive outputs attain a metastable steady state. This steady-state behavior suggests that the deductive outputs learn representations equivalent to scaling functions, universality classes, and renormalization groups from the training dataset, leading to generalization and reasoning capabilities in the process. We can then define an order parameter from the global statistics of the model's deductive-output parameters at inference. The reasoning capabilities of a PLDR-LLM are better when its order parameter is close to zero at criticality. This observation is supported by the benchmark scores of models trained at near-criticality and sub-criticality. Our results provide a self-contained explanation of how reasoning manifests in large language models: the ability to reason can be quantified solely from the global model parameter values of the deductive outputs at steady state, without any need to evaluate curated benchmark datasets through the inductive output for reasoning and comprehension.
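For readers without a statistical-mechanics background, the analogy the abstract invokes is the textbook behavior of a second-order phase transition (e.g., Ising magnetization): the order parameter $m$ vanishes continuously at the critical temperature $T_c$ while the correlation length $\xi$ diverges. This is standard background, not the paper's own definition of its order parameter:

```latex
% Standard second-order phase-transition scaling, with critical
% exponents \beta and \nu; given as background for the analogy only.
m(T) \sim
\begin{cases}
(T_c - T)^{\beta}, & T < T_c \\
0, & T \ge T_c
\end{cases}
\qquad
\xi(T) \sim |T - T_c|^{-\nu}
```

In this analogy, the claim that "reasoning is strongest when the order parameter is near zero at criticality" corresponds to operating at $T \approx T_c$, where $m \to 0$ and correlations span the whole system.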