Multilingual Language Models Encode Script Over Linguistic Structure

arXiv cs.LG / 4/8/2026


Key Points

  • The paper analyzes how multilingual language models form internal representations, testing whether they are organized more by abstract language identity/typology or by surface-form cues such as orthography.
  • Using the Language Activation Probability Entropy (LAPE) metric and Sparse Autoencoders on the compact, distilled models Llama-3.2-1B and Gemma-2-2B, the authors find that orthography dominates representational structure (see the LAPE sketch after this list).
  • Romanization induces near-disjoint internal representations that align well with neither native-script inputs nor English, indicating strong sensitivity to surface-form changes.
  • Word-order shuffling has limited impact on which internal “language-associated units” are activated, suggesting that typological word order is not the primary driver of unit identity.
  • Probing shows that typological information becomes more accessible in deeper layers, while causal interventions show that generation depends more on units invariant to surface-form perturbations than on units selected purely by typological alignment.
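
The LAPE score itself is simple: estimate, for each unit, the probability that it activates on text from each language, normalize those probabilities into a distribution over languages, and take the entropy of that distribution; low-entropy units are the "language-associated units" the paper tracks. Below is a minimal sketch following the metric's standard definition (Tang et al., 2024); the activation criterion (output > 0) and the example numbers are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def lape(act_probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Language Activation Probability Entropy.

    act_probs: (num_units, num_languages) array whose [i, k] entry is the
    empirical probability that unit i activates (here: post-nonlinearity
    output > 0, an assumed criterion) on text in language k.
    Returns one entropy per unit; low entropy marks units whose activation
    concentrates on few languages.
    """
    # Normalize each unit's activation probabilities into a distribution
    # over languages.
    dist = act_probs / (act_probs.sum(axis=1, keepdims=True) + eps)
    # Entropy of that distribution: near 0 for a language-specific unit,
    # near log(num_languages) for a language-agnostic one.
    return -(dist * np.log(dist + eps)).sum(axis=1)

# Toy example: unit 0 fires almost only on language 0; unit 1 fires everywhere.
probs = np.array([[0.90, 0.01, 0.01],
                  [0.40, 0.38, 0.42]])
print(lape(probs))  # ~0.12 for unit 0, ~1.10 (= log 3) for unit 1
```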

Abstract

Multilingual language models (LMs) organize representations for typologically and orthographically diverse languages into a shared parameter space, yet the nature of this internal organization remains elusive. In this work, we investigate which linguistic properties (abstract language identity or surface-form cues) shape multilingual representations. Focusing on compact, distilled models where representational trade-offs are explicit, we analyze language-associated units in Llama-3.2-1B and Gemma-2-2B using the Language Activation Probability Entropy (LAPE) metric, and further decompose activations with Sparse Autoencoders. We find that these units are strongly conditioned on orthography: romanization induces near-disjoint representations that align with neither native-script inputs nor English, while word-order shuffling has limited effect on unit identity. Probing shows that typological structure becomes increasingly accessible in deeper layers, while causal interventions indicate that generation is most sensitive to units that are invariant to surface-form perturbations rather than to units identified by typological alignment alone. Overall, our results suggest that multilingual LMs organize representations around surface form, with linguistic abstraction emerging gradually without collapsing into a unified interlingua.
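
To make the two surface-form perturbations concrete: romanization changes the script while preserving content, whereas word-order shuffling preserves the script while destroying order. The paper's actual romanization tooling is not specified above, so the sketch below uses the unidecode package as a stand-in transliterator; both helper names are hypothetical:

```python
import random

from unidecode import unidecode  # pip install unidecode; stand-in romanizer

def romanize(text: str) -> str:
    """Crude Latin-script transliteration (hypothetical helper); the
    paper's actual romanization procedure may differ."""
    return unidecode(text)

def shuffle_word_order(text: str, seed: int = 0) -> str:
    """Destroy word order while keeping every surface token intact."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

original = "Привет, как дела сегодня?"
print(romanize(original))            # script changes, content preserved
print(shuffle_word_order(original))  # order changes, script preserved
```

Comparing which LAPE-selected units fire on the original versus each perturbed variant is the kind of contrast the reported findings rest on: near-disjoint units under romanization, largely stable units under shuffling.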