Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

arXiv cs.CL / March 31, 2026


Key Points

  • The paper finds that large language models exhibit a form of categorical perception in their hidden states when processing Arabic numerals, showing geometric “structural warping” at digit-count boundaries (specifically at transitions like 10 and 100).
  • Across six models from five different architecture families, a CP-additive representational geometry model fits better than a purely continuous model at 100% of primary layers tested, indicating the effect is robust within LLM internal representations.
  • The boundary-specific warping is absent at non-boundary control positions and is also absent in the temperature domain, where there is no tokenization discontinuity for linguistic categories such as “hot/cold.”
  • Two distinct signatures are reported: “classic CP” models both internalize the category distinction and show warping, while “structural CP” models show the warping at the boundary even though they cannot explicitly report the category distinction.
  • The authors conclude that structural input-format discontinuities alone can induce categorical-perception-like geometry in LLMs, independent of explicit semantic category knowledge and stable across boundaries and architectural families.

Abstract

Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied phenomena in perceptual psychology. This paper reports that analogous geometric warping occurs in the hidden-state representations of large language models (LLMs) processing Arabic numerals. Using representational similarity analysis across six models from five architecture families, the study finds that a CP-additive model (log-distance plus a boundary boost) fits the representational geometry better than a purely continuous model at 100% of primary layers in every model tested. The effect is specific to structurally defined boundaries (digit-count transitions at 10 and 100), absent at non-boundary control positions, and absent in the temperature domain, where linguistic categories (hot/cold) lack a tokenization discontinuity. Two qualitatively distinct signatures emerge: "classic CP" (Gemma, Qwen), where models both categorize explicitly and show geometric warping, and "structural CP" (Llama, Mistral, Phi), where geometry warps at the boundary but models cannot report the category distinction. This dissociation is stable across boundaries and is a property of the architecture, not the stimulus. Structural input-format discontinuities are sufficient to produce categorical-perception geometry in LLMs, independently of explicit semantic category knowledge.
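To make the model comparison concrete, the sketch below simulates the kind of analysis the abstract describes: pairwise representational dissimilarities between numerals are fit by a purely continuous log-distance model and by a "CP-additive" model that adds a constant boost whenever a pair straddles a digit-count boundary (10 or 100). All names, the synthetic data, and the least-squares fit are illustrative assumptions, not the paper's actual code or data.

```python
import numpy as np

rng = np.random.default_rng(0)
nums = np.arange(1, 200)

def digit_count(n: int) -> int:
    """Number of digits in the Arabic-numeral form of n."""
    return len(str(n))

# Synthetic stand-in for hidden-state dissimilarities: a log-distance
# term plus a boost for pairs crossing a digit-count boundary, plus noise.
pairs = [(i, j) for i in nums for j in nums if i < j]
log_d = np.array([abs(np.log(i) - np.log(j)) for i, j in pairs])
cross = np.array([float(digit_count(i) != digit_count(j)) for i, j in pairs])
y = 1.0 * log_d + 0.5 * cross + rng.normal(0.0, 0.05, len(pairs))

def fit_r2(X: np.ndarray, y: np.ndarray) -> float:
    """Ordinary least squares with an intercept; returns R^2."""
    X = np.column_stack([X, np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

r2_continuous = fit_r2(log_d[:, None], y)          # log-distance only
r2_cp_additive = fit_r2(np.column_stack([log_d, cross]), y)  # + boundary term

print(f"continuous R^2 = {r2_continuous:.3f}")
print(f"CP-additive R^2 = {r2_cp_additive:.3f}")
```

On data that genuinely contains a boundary boost, the CP-additive model fits strictly better; the paper's claim is that this same pattern holds for real hidden-state geometries at every primary layer tested.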