From Syntax to Emotion: A Mechanistic Analysis of Emotion Inference in LLMs
arXiv cs.CL · April 29, 2026
Key Points
- The paper uses sparse autoencoders to probe how LLMs internally represent emotion recognition, finding a consistent three-phase information flow where emotion-relevant features appear only in the final phase.
- It shows that emotion representations are built from both shared features across emotions and emotion-specific features, with different emotions relying on different causal mechanisms.
- Phase-stratified causal tracing identifies a small set of influential features that drive emotion predictions; the number and causal impact of these features vary by emotion, with disgust appearing more weakly and diffusely represented than other emotions.
- The authors propose an interpretable, data-efficient causal feature steering method that improves emotion recognition across multiple models while largely preserving language-modeling ability; the gains generalize across multiple emotion datasets.
- Overall, the work offers a systematic mechanistic account of emotion inference in LLMs and a practical, controllable intervention for boosting performance in emotionally sensitive applications.
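The core idea behind feature steering with a sparse autoencoder can be illustrated with a toy sketch: take an SAE decoder direction associated with a target feature and add a scaled copy of it to a model's hidden state, nudging the representation toward that feature. The function and parameter names below (`steer_hidden_state`, `alpha`) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def steer_hidden_state(hidden: np.ndarray, decoder_row: np.ndarray, alpha: float) -> np.ndarray:
    """Add a scaled SAE feature direction to a hidden state.

    hidden:      the model's residual-stream activation at some layer
    decoder_row: the SAE decoder row (feature direction) to amplify
    alpha:       steering strength; positive values push toward the feature
    """
    # Normalize so alpha directly controls the size of the nudge.
    direction = decoder_row / np.linalg.norm(decoder_row)
    return hidden + alpha * direction

# Toy example: an 8-dim hidden state and one hypothetical "emotion" feature.
rng = np.random.default_rng(0)
hidden = rng.normal(size=8)
feature_dir = rng.normal(size=8)

steered = steer_hidden_state(hidden, feature_dir, alpha=2.0)
```

By construction, the steered state's projection onto the (unit-normalized) feature direction increases by exactly `alpha`, while components orthogonal to the feature are untouched, which is one reason this style of intervention can boost a target behavior with limited damage to general language modeling.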