Same Geometry, Opposite Noise: Transformer Magnitude Representations Lack Scalar Variability
arXiv cs.CL / 4/7/2026
Key Points
- The paper tests whether transformer language models exhibit “scalar variability,” where representational noise scales proportionally with magnitude, yielding the constant coefficient of variation (CV) observed in biological magnitude systems.
- Across 26 numerical magnitudes in three 7–8B models (Llama-3-8B-Instruct, Mistral-7B-Instruct-v0.3, and Llama-3-8B-Base), the authors find an anti-scalar pattern: representational variability decreases as magnitude increases (scaling exponent alpha ≈ -0.19).
- The negative scaling persists under multiple checks, including full-dimensional space analysis (alpha ≈ -0.04) and sentence-identity correction (alpha ≈ -0.007), with none of the 16 primary layers showing alpha > 0 (0/16).
- The anti-scalar effect is reported to be 3–5× stronger along the magnitude axis than in orthogonal dimensions, and corpus frequency strongly predicts per-magnitude variability (rho = 0.84; see the estimation sketch after these key points).
- The authors conclude that standard distributional learning in transformers reproduces some log-compressive magnitude geometry but does not produce the biological constant-CV noise signature.
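The scaling exponent alpha referenced above corresponds to a power-law fit of representational variability against magnitude, and the rho value to a rank correlation between corpus frequency and that variability. Below is a minimal sketch of how such quantities could be estimated from per-magnitude variability values; the variable names, the toy data, and the use of hidden-state SD as the variability measure are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np
from scipy import stats

def scaling_exponent(magnitudes, variabilities):
    """Fit variability ~ magnitude**alpha by least squares in log-log space.

    alpha ≈ 0  -> variability roughly independent of magnitude
    alpha ≈ 1  -> constant coefficient of variation (scalar variability)
    alpha <  0 -> anti-scalar pattern (variability shrinks as magnitude grows)
    """
    log_m = np.log(np.asarray(magnitudes, dtype=float))
    log_v = np.log(np.asarray(variabilities, dtype=float))
    slope, intercept, r, p, stderr = stats.linregress(log_m, log_v)
    return slope, p

# Hypothetical inputs: one variability estimate (e.g. SD of hidden states
# along a magnitude axis) per numerical magnitude 1..26, plus toy corpus
# counts. These numbers are synthetic placeholders, not the paper's data.
magnitudes    = np.arange(1, 27)
rng           = np.random.default_rng(0)
variabilities = 0.5 * magnitudes ** -0.19 * np.exp(rng.normal(0, 0.05, 26))
corpus_counts = 1e6 / magnitudes

alpha, p_alpha = scaling_exponent(magnitudes, variabilities)
rho, p_rho = stats.spearmanr(corpus_counts, variabilities)

print(f"scaling exponent alpha = {alpha:.3f} (p = {p_alpha:.3g})")
print(f"frequency-variability Spearman rho = {rho:.2f} (p = {p_rho:.3g})")
```

Under scalar variability the fitted alpha would sit near 1 (SD growing in proportion to magnitude, so CV stays constant); the negative exponents the paper reports indicate the opposite trend.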