The Randomness Floor: Measuring Intrinsic Non-Randomness in Language Model Token Distributions

arXiv cs.CL / 4/28/2026

💬 Opinion · Models & Research

Key Points

  • The paper proposes Entropic Deviation (ED), a normalised KL-divergence metric comparing a language model's token distribution to the uniform distribution to quantify intrinsic non-randomness (a minimal sketch of the metric follows this list).
  • Across 31,200 generations over seven models, ED remains substantial (around 0.30 for transformers) even under semantically neutral prompts, suggesting that 88-93% of the non-randomness observed under semantic prompts is intrinsic to the learned weights rather than induced by context.
  • Transformer families such as Gemma, Llama, and Qwen converge on nearly identical ED values despite differences in training data and vocabularies, indicating that the non-randomness floor is a structural property of pretrained transformers.
  • In contrast, the state space model (Mamba2) exhibits a qualitatively different regime: roughly twice the ED, about three times lower within-sequence variance, and strong temperature sensitivity (r = -0.78), whereas transformers are nearly insensitive (r < 0.05).
  • Cross-lingual tests with Qwen-32B show a stable ED gradient across five languages that does not correlate with token fertility and persists even when two languages sharing an identical tokeniser subset are compared, implying that language itself modulates the randomness bound beyond tokenisation effects.
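
To make the metric concrete, below is a minimal sketch of how ED could be computed for a single next-token distribution. The KL direction (model distribution against uniform) and the normalisation by log|V| are illustrative assumptions; the paper's exact convention may differ.

```python
import numpy as np

def entropic_deviation(probs: np.ndarray) -> float:
    """Sketch of Entropic Deviation (ED) for one next-token distribution.

    Assumes ED = KL(p || uniform) / log|V|, which equals 1 - H(p)/log|V|
    and lies in [0, 1]: 0 for a uniform distribution, 1 for a one-hot one.
    The paper's exact normalisation may differ.
    """
    p = np.asarray(probs, dtype=np.float64)
    p = p / p.sum()                       # ensure a proper distribution
    vocab_size = p.size
    nonzero = p > 0                       # treat 0 * log 0 as 0
    kl = np.sum(p[nonzero] * np.log(p[nonzero] * vocab_size))
    return float(kl / np.log(vocab_size))

# Quick check: uniform gives ~0, a sharply peaked distribution approaches 1.
print(entropic_deviation(np.full(32_000, 1 / 32_000)))   # ~0.0
peaked = np.full(32_000, 1e-12)
peaked[0] = 1.0
print(entropic_deviation(peaked))                         # ~1.0
```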

Abstract

Language models cannot be random. This paper introduces Entropic Deviation (ED), the normalised KL divergence between a model's token distribution and the uniform distribution, and measures it systematically across 31,200 generations spanning seven models, two architectures (transformer and state space), nine prompt categories, three temperatures, and five languages. Under semantically neutral prompts (empty strings, random characters, nonsense syllables) transformers still exhibit ED of approximately 0.30, meaning that 88-93% of the non-randomness observed under semantic prompts is intrinsic to the learned weights rather than induced by context. Three transformer families (Gemma, Llama, Qwen) converge on nearly identical ED values despite different training data and vocabularies. A state space model (Mamba2) reveals a qualitatively different regime: twice the ED, three times lower within-sequence variance, and massive sensitivity to temperature (r = -0.78) where transformers are nearly immune (r < 0.05). Cross-lingual experiments with Qwen-32B show a stable gradient across five languages (English, Japanese, Chinese, Polish, Arabic) that does not correlate with token fertility and persists when two languages sharing an identical tokeniser subset are compared. These findings establish a structural lower bound on randomness in pretrained language models, characterise how this bound differs across architectures, and demonstrate that language itself modulates the bound independently of tokenisation.
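
For a rough sense of how such measurements could be reproduced in miniature, the sketch below scores the next-token distribution at every position of a given text with a Hugging Face causal LM and reports the mean and within-sequence variance of ED. The checkpoint name is a placeholder, and scoring a fixed text rather than sampled generations at several temperatures (as the paper does, across 31,200 generations) is a simplifying assumption; it reuses the entropic_deviation helper sketched above.

```python
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-0.5B"   # placeholder checkpoint, not necessarily one used in the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def sequence_ed(text: str) -> tuple[float, float]:
    """Mean and within-sequence variance of per-position ED for a given text."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                    # shape: (1, seq_len, vocab)
    probs = torch.softmax(logits[0], dim=-1).numpy()  # one distribution per position
    # entropic_deviation is the helper defined in the earlier sketch
    eds = [entropic_deviation(p) for p in probs]
    return float(np.mean(eds)), float(np.var(eds))

# Semantically neutral vs. semantic prompt, loosely mirroring the paper's prompt categories.
print(sequence_ed("qwzx vbnm plok"))
print(sequence_ed("The capital of France is"))
```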