Large language models show fragile cognitive reasoning about human emotions
arXiv cs.CL / 3/16/2026
Key Points
- The paper introduces CoRE, a benchmark grounded in cognitive appraisal theory, to probe the implicit cognitive structures LLMs rely on when interpreting emotionally charged situations.
- The study finds that LLMs capture systematic relations between cognitive appraisals and emotions, but their judgments are misaligned with human judgments and unstable across contexts.
- The evaluation includes alignment with human patterns, internal consistency, cross-model generalization, and robustness to contextual variation.
- The results highlight the fragility of LLM-based emotion reasoning, with implications for affective computing research and for how AI emotion understanding is evaluated.
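
The summary does not say which metrics the paper uses, so the following is only a minimal sketch of how two of the listed evaluation axes might be quantified. It assumes that alignment with human patterns is scored as a Pearson correlation between human and model appraisal ratings, and that robustness to contextual variation is scored as the spread of one rating across reworded contexts. The function names `pearson` and `context_spread` and all numbers are hypothetical, made up for illustration, and are not taken from the CoRE benchmark.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between two equal-length rating lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / (var_x * var_y) ** 0.5


def context_spread(ratings_by_context):
    """Population standard deviation of one appraisal rating across
    reworded versions of the same situation (0.0 means fully stable)."""
    return pstdev(ratings_by_context)


# Toy ratings, invented for illustration only (not results from the paper).
human_ratings = [1, 2, 3, 4, 5]
model_ratings = [2, 4, 6, 8, 10]
alignment = pearson(human_ratings, model_ratings)

# The same (hypothetical) model rating collected under two rewordings.
instability = context_spread([1, 5])
```

Under these assumptions, a higher `alignment` would indicate closer agreement with human appraisal patterns, and a larger `instability` would indicate less robustness to contextual variation.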