Disentangling Mathematical Reasoning in LLMs: A Methodological Investigation of Internal Mechanisms

arXiv cs.CL / 4/20/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper investigates how LLMs internally process arithmetic reasoning by tracing how next-token predictions are constructed across layers during task execution.
  • It finds that models identify arithmetic tasks early, but producing correct arithmetic results depends on processing in the final layers.
  • Models proficient in arithmetic show a distinct “division of labor”: attention mainly propagates relevant input information, while MLP modules aggregate it.
  • The authors suggest that strong models handle harder arithmetic in a more functional “reasoning-like” way rather than relying solely on factual recall.

Abstract

Large language models (LLMs) have demonstrated impressive capabilities, yet their internal mechanisms for handling reasoning-intensive tasks remain underexplored. To advance the understanding of model-internal processing mechanisms, we present an investigation of how LLMs perform arithmetic operations by examining internal mechanisms during task execution. Using early decoding, we trace how next-token predictions are constructed across layers. Our experiments reveal that while the models recognize arithmetic tasks early, correct result generation occurs only in the final layers. Notably, models proficient in arithmetic exhibit a clear division of labor between attention and MLP modules, where attention propagates input information and MLP modules aggregate it. This division is absent in less proficient models. Furthermore, successful models appear to process more challenging arithmetic tasks functionally, suggesting reasoning capabilities beyond factual recall.
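The “early decoding” technique the abstract describes is closely related to the logit-lens idea: each layer’s residual stream is projected through the unembedding matrix to read off what the model would predict if decoding stopped there. The following is a minimal toy sketch of that idea with random weights; the sizes, names (`W_U`, `early_decode`), and shapes are illustrative assumptions, not the paper’s actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 6 layers, hidden dim 16, vocabulary of 50 tokens.
n_layers, d_model, vocab = 6, 16, 50
W_U = rng.normal(size=(d_model, vocab))        # unembedding matrix (assumed name)
hidden = rng.normal(size=(n_layers, d_model))  # residual stream after each layer

def early_decode(hidden_states, unembed):
    """Project each layer's residual stream through the unembedding to
    read off a per-layer 'current' next-token prediction (logit-lens style)."""
    logits = hidden_states @ unembed           # (n_layers, vocab)
    return logits.argmax(axis=-1)              # top token id at each layer

preds = early_decode(hidden, W_U)
for layer, tok in enumerate(preds):
    print(f"layer {layer}: top token id = {tok}")
```

In a real model, comparing these per-layer predictions to the final output is what lets one see that the correct arithmetic result emerges only in the last layers, even when the task itself is recognized early.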
