Shared Emotion Geometry Across Small Language Models: A Cross-Architecture Study of Representation, Behavior, and Methodological Confounds
arXiv cs.CL / April 14, 2026
Key Points
- The paper extracts vector representations of 21 emotions from twelve small language models spanning six architectures (base and instruct variants), using a unified comprehension-mode pipeline at fp16 precision, then compares the resulting emotion “geometries” via representational similarity analysis (cosine RDMs compared with Spearman correlations; a sketch of this comparison appears after this list).
- Results show striking universality: the five mature model families (Qwen 2.5, SmolLM2, Llama 3.2, Mistral 7B, Llama 3.1) share a closely aligned 21-emotion geometry, with pairwise RDM Spearman correlations typically in the 0.74–0.92 range.
- The universality holds even where models differ strongly in behavior (e.g., Qwen 2.5 and Llama 3.2 diverge on MTI compliance facets yet show highly similar emotion representations, rho ≈ 0.81), implying that behavioral differences may arise “above” a shared emotion-representation layer.
- The one outlier, Gemma-3 1B base, shows severe representational degeneracy (residual-stream anisotropy ≈ 0.997, meaning its hidden states collapse into a narrow cone; see the anisotropy sketch below) and is substantially restructured by RLHF across the geometric descriptors, whereas within the mature families base and instruct RDMs correlate very highly (e.g., Mistral 7B v0.3, rho ≈ 0.985), suggesting RLHF mainly reshapes representations that are not yet well organized.
- Methodologically, the authors argue that prior “method effect” conclusions conflate several factors: method-dependent dissociation, sensitivity to sub-parameters within generation, fp16 vs. INT8 precision effects, and cross-experiment bias. A single cross-study similarity number (one pooled rho) can therefore be misleading without decomposition; the final sketch below illustrates a condition-stratified comparison.
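
A minimal sketch of the RDM comparison described in the first point, assuming each model's 21 emotion vectors have already been extracted as a (21, d) array; the variable names and placeholder data are hypothetical stand-ins, not the paper's pipeline.

```python
# Cosine RDMs over 21 emotion vectors, compared with a Spearman correlation.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def cosine_rdm(emotion_vectors: np.ndarray) -> np.ndarray:
    """Condensed cosine-distance RDM over the rows (one row per emotion)."""
    return pdist(emotion_vectors, metric="cosine")  # length 21*20/2 = 210

def rdm_similarity(rdm_a: np.ndarray, rdm_b: np.ndarray) -> float:
    """Spearman rho between two condensed RDMs."""
    rho, _ = spearmanr(rdm_a, rdm_b)
    return float(rho)

# Hypothetical usage with random stand-ins for two models' (21, d) arrays.
rng = np.random.default_rng(0)
vecs_model_a = rng.normal(size=(21, 2048))
vecs_model_b = rng.normal(size=(21, 2048))
rho = rdm_similarity(cosine_rdm(vecs_model_a), cosine_rdm(vecs_model_b))
print(f"RDM Spearman rho: {rho:.3f}")
```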
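The Gemma-3 anisotropy figure can be read under the common definition of anisotropy as the mean pairwise cosine similarity of sampled residual-stream vectors; the sketch below uses that definition, though the paper's exact estimator may differ.

```python
# Anisotropy as mean pairwise cosine similarity of residual-stream vectors;
# values near 1.0 indicate the representations occupy a narrow cone.
import numpy as np

def anisotropy(hidden_states: np.ndarray) -> float:
    """Mean off-diagonal cosine similarity over an (n, d) sample of vectors."""
    normed = hidden_states / np.linalg.norm(hidden_states, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = sims.shape[0]
    return float(sims[~np.eye(n, dtype=bool)].mean())

# Placeholder: a shared offset makes random vectors point the same way,
# driving anisotropy toward 1.0 (the Gemma-3-like regime).
rng = np.random.default_rng(0)
sample = rng.normal(size=(500, 2048)) + 10.0
print(f"anisotropy ≈ {anisotropy(sample):.3f}")
```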
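For the decomposition argument in the last point, a hedged sketch of what condition-stratified comparison could look like: instead of one pooled rho, compute pairwise RDM correlations keyed by condition (method, precision) so matched and mismatched comparisons stay separate. The condition labels and data here are invented for illustration.

```python
# Condition-stratified RDM comparison: keep one rho per pair of conditions
# instead of pooling everything into a single cross-study number.
import numpy as np
from scipy.stats import spearmanr

def stratified_rhos(rdms: dict) -> dict:
    """Spearman rho for every pair of condensed RDMs, keyed by the two
    condition labels, so matched- vs. mismatched-condition comparisons
    can be inspected separately."""
    keys = list(rdms)
    out = {}
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            rho, _ = spearmanr(rdms[a], rdms[b])
            out[(a, b)] = float(rho)
    return out

# Invented conditions and random condensed RDMs (length 210 = 21*20/2).
rng = np.random.default_rng(0)
conditions = [("comprehension", "fp16"), ("generation", "fp16"), ("generation", "int8")]
rdms = {c: rng.random(210) for c in conditions}
for pair, rho in stratified_rhos(rdms).items():
    print(pair, f"rho = {rho:.2f}")
```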