Geometric Properties of the Voronoi Tessellation in Latent Semantic Manifolds of Large Language Models

arXiv cs.LG / 4/9/2026


Key Points

  • The paper analyzes how discrete-token LLMs induce a Voronoi tessellation in their continuous latent representation space, studying this empirically on Qwen3.5-4B-Base.
  • It validates a linear scaling law for the expressibility gap using float32 margin recomputation to remove bfloat16 quantization artifacts, reporting an extremely strong fit (R² = 0.9997).
  • The study identifies a mid-layer “geometric ambiguity” regime where margin geometry is anti-correlated with cross-entropy (layers 24–28), before transitioning to strong alignment at the final layer.
  • It proposes margin refinement procedures (MRP) as short post-hoc optimization runs that reshape the model’s Voronoi tessellation without full retraining, comparing margin maximization vs Fisher information distance maximization.
  • Fisher-based MRP reaches the shared improvement ceiling (~16,300 correctable positions per 256K evaluated) with flat collateral damage and unchanged downstream benchmarks, though its benefits concentrate in high-frequency structural tokens.
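The Voronoi picture in the first key point can be made concrete: under greedy decoding, each token owns the region of latent space where its unembedding row yields the largest logit, and the top-1/top-2 logit gap measures how far a hidden state sits from the nearest cell boundary. A minimal sketch with a toy random unembedding matrix (the vocabulary size, width, and function names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 1000, 64                                   # toy vocab size and hidden width
W = rng.normal(size=(V, d)).astype(np.float32)    # hypothetical unembedding rows

def voronoi_cell(h):
    """Greedy decoding selects argmax_t <W_t, h>, so token t owns the
    region of latent space where its logit dominates all others."""
    return int(np.argmax(W @ h))

def decision_margin(h):
    """Gap between the winning logit and the runner-up: the distance, in
    logit units, from h to the nearest Voronoi cell boundary."""
    logits = np.sort(W @ h)
    return float(logits[-1] - logits[-2])

h = rng.normal(size=d).astype(np.float32)
t = voronoi_cell(h)          # index of the cell h falls in
m = decision_margin(h)       # non-negative by construction
```

Widening `decision_margin` at positions where the model is nearly ambiguous is, in this framing, what a margin refinement procedure does.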

Abstract

Language models operate on discrete tokens but compute in continuous vector spaces, inducing a Voronoi tessellation over the representation manifold. We study this tessellation empirically on Qwen3.5-4B-Base, making two contributions. First, using float32 margin recomputation to resolve bfloat16 quantization artifacts, we validate Mabrok's (2026) linear scaling law of the expressibility gap with R² = 0.9997, the strongest confirmation to date, and identify a mid-layer geometric ambiguity regime where margin geometry is anti-correlated with cross-entropy (layers 24–28, ρ = −0.29) before crystallizing into alignment at the final layer (ρ = 0.836). Second, we show that the Voronoi tessellation of a converged model is reshapable through margin refinement procedures (MRP): short post-hoc optimization runs that widen token-decision margins without retraining. We compare direct margin maximization against Fisher information distance maximization across a dose-response sweep. Both methods find the same ceiling of ~16,300 correctable positions per 256K evaluated, but differ critically in collateral damage. Margin maximization damage escalates with intervention strength until corrections are overwhelmed. Fisher damage remains constant at ~5,300 positions across the validated range (λ = 0.15–0.6), achieving +28% median margin improvement at λ = 0.6 with invariant downstream benchmarks: a geometric reorganization that compresses the expressibility gap while preserving its scaling law. However, frequency and token-class audits reveal that gains concentrate in high-frequency structural tokens (84% of net corrections at λ = 0.6), with content and entity-like contributions shrinking at higher λ. Fisher MRP is therefore a viable geometric polishing tool whose practical ceiling is set not by aggregate damage but by the uniformity of token-level benefit.
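The float32 margin recomputation the abstract relies on can be illustrated directly: bfloat16 keeps only 8 mantissa bits, so small top-1/top-2 logit gaps can be distorted or collapsed by rounding, and recomputing the margin in float32 removes that artifact. A minimal sketch, emulating bfloat16 by truncating the float32 mantissa (true bfloat16 rounds to nearest; truncation is a close stand-in, and all names here are illustrative):

```python
import numpy as np

def to_bfloat16(x):
    """Emulate bfloat16 by zeroing the low 16 mantissa bits of float32.
    (Hardware bfloat16 rounds to nearest; truncation approximates it.)"""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

def decision_margin(logits):
    """Top-1 minus top-2 logit gap, computed in float32."""
    top2 = np.sort(np.asarray(logits, dtype=np.float32))[-2:]
    return float(top2[1] - top2[0])

rng = np.random.default_rng(0)
logits = rng.normal(size=32000).astype(np.float32)   # toy vocab-sized logit vector

m32 = decision_margin(logits)                # margin from full-precision logits
m16 = decision_margin(to_bfloat16(logits))   # margin after bfloat16 rounding
quantization_error = abs(m32 - m16)          # the artifact float32 recomputation removes
```

The paper's recomputation corresponds to reporting `m32` rather than `m16`; the gap between them is the quantization artifact being resolved.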