A Universal Vibe? Finding and Controlling Language-Agnostic Informal Register with SAEs
arXiv cs.CL / 3/30/2026
Key Points
- Using Sparse Autoencoders (SAEs) on Gemma-2-9B-IT, the paper investigates whether multilingual LLMs represent culture-specific pragmatic registers (e.g., slang) as language-agnostic abstractions or memorize them separately for each language.
- It introduces a new probing dataset designed to disentangle pragmatic register from lexical sensitivity by using polysemous terms that appear in both literal and informal contexts.
- The authors find a small but highly robust cross-linguistic “core” of informal-register features that forms a geometrically coherent informal-register subspace, becoming clearer in deeper model layers.
- Using activation steering, they show causal shifts in output formality across all tested source languages and report zero-shot transfer to six unseen languages across different families and scripts.
- The authors present the results as the first mechanistic evidence that multilingual LLMs encode informal register as a portable pragmatic abstraction rather than relying only on surface-level heuristics.
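
To make the "cross-linguistic core" and activation-steering points concrete, here is a minimal toy sketch in plain Python. It is not the authors' code. It assumes that the core is the set of SAE features active for informal text in every language, that the steering direction is the normalized mean of those features' decoder vectors, and that steering adds a scaled copy of that direction to a hidden state. The summary confirms none of these details. All function names, feature ids, and vectors are hypothetical.

```python
import math

def core_features(features_by_language):
    """Feature ids present in every language's informal-feature list (assumed definition of the core)."""
    sets = [set(ids) for ids in features_by_language.values()]
    return set.intersection(*sets) if sets else set()

def mean_direction(decoder_rows):
    """Average the given decoder vectors and L2-normalize the result."""
    dim = len(decoder_rows[0])
    mean = [sum(row[i] for row in decoder_rows) / len(decoder_rows) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        raise ValueError("decoder vectors cancel out; no direction to steer along")
    return [x / norm for x in mean]

def steer(hidden, direction, alpha):
    """Shift one hidden-state vector by alpha along a unit direction."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

# Hypothetical toy data: feature ids flagged as informal per language.
features_by_language = {"en": [3, 7, 11], "es": [7, 11, 20], "ja": [11, 7, 42]}
core = core_features(features_by_language)

# Hypothetical 3-dimensional decoder vectors for the core features.
decoder = {7: [1.0, 0.0, 0.0], 11: [0.0, 1.0, 0.0]}
direction = mean_direction([decoder[i] for i in sorted(core)])

# Positive alpha pushes toward the informal direction; negative alpha pushes away from it.
steered = steer([0.5, 0.5, 0.5], direction, alpha=2.0)
```

In a real model the hidden state would be a residual-stream activation at a chosen layer, and `alpha` would need tuning so that the register shifts without degrading fluency.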




