
CMHL: Contrastive Multi-Head Learning for Emotionally Consistent Text Classification

arXiv cs.CL / 3/17/2026


Key Points

  • CMHL is a single-model architecture that explicitly models the logical structure of emotions through multi-task learning (jointly predicting primary emotion, valence, and intensity), psychologically grounded auxiliary supervision from Russell's circumplex model, and a novel contrastive contradiction loss that enforces emotional consistency.
  • With 125M parameters, CMHL outperforms 56x larger LLMs and ensembles, achieving a new state-of-the-art F1 score of 93.75% on the dair-ai Emotion dataset.
  • The approach demonstrates cross-domain generalization, outperforming domain-specific models on SWMH (Reddit Suicide Watch and Mental Health Collection) with F1 around 72.50% and recall around 73.30%, indicating enhanced sensitivity to mental health distress.
  • The work argues that architectural intelligence and embedding psychological priors, rather than sheer parameter count, drive progress in emotion classification, offering an efficient, interpretable, and clinically relevant paradigm for affective computing.
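The circumplex-derived supervision in the first key point can be sketched as a label-mapping step: each gold emotion label induces auxiliary targets for the valence and intensity heads. The mapping below is an assumption for illustration; the paper defines the actual correspondence.

```python
# Hypothetical mapping from dair-ai Emotion labels to auxiliary targets
# inspired by Russell's circumplex model. Both dictionaries are assumptions
# for illustration, not the paper's exact mapping.
EMOTION_TO_VALENCE = {
    "sadness": "negative", "joy": "positive", "love": "positive",
    "anger": "negative", "fear": "negative", "surprise": "positive",
}
EMOTION_TO_INTENSITY = {
    "sadness": "low", "joy": "high", "love": "low",
    "anger": "high", "fear": "high", "surprise": "high",
}

def auxiliary_targets(label: str) -> tuple[str, str]:
    """Derive (valence, intensity) supervision from a gold emotion label."""
    return EMOTION_TO_VALENCE[label], EMOTION_TO_INTENSITY[label]
```

During training, the valence and intensity heads would be supervised with these derived targets alongside the primary emotion label, so all three predictions come from one shared encoder.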

Abstract

Textual Emotion Classification (TEC) is one of the most difficult NLP tasks. State-of-the-art approaches rely on large language models (LLMs) and multi-model ensembles. In this study, we challenge the assumption that larger scale or more complex models are necessary for improved performance. To improve logical consistency, we introduce CMHL, a novel single-model architecture that explicitly models the logical structure of emotions through three key innovations: (1) multi-task learning that jointly predicts primary emotions, valence, and intensity, (2) psychologically grounded auxiliary supervision derived from Russell's circumplex model, and (3) a novel contrastive contradiction loss that enforces emotional consistency by penalizing mutually incompatible predictions (e.g., simultaneous high confidence in joy and anger). With just 125M parameters, our model outperforms 56x larger LLMs and sLM ensembles, setting a new state-of-the-art F1 score of 93.75% on the dair-ai Emotion dataset (vs. 86.13%-93.2% for prior methods). We further show cross-domain generalization on the Reddit Suicide Watch and Mental Health Collection dataset (SWMH), outperforming domain-specific models like MentalBERT and MentalRoBERTa with an F1 score of 72.50% (vs. 68.16%-72.16%) and a recall of 73.30% (vs. 67.05%-70.89%), which translates to enhanced sensitivity for detecting mental health distress. Our work establishes that architectural intelligence, not parameter count, drives progress in TEC. By embedding psychological priors and explicit consistency constraints, a well-designed single model can outperform both massive LLMs and complex ensembles, offering an efficient, interpretable, and clinically relevant paradigm for affective computing.
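The contrastive contradiction loss described above can be sketched as a penalty on the product of probabilities assigned to mutually incompatible emotion pairs: the product is large only when both incompatible classes receive high confidence. A minimal sketch, assuming a fixed list of contradictory pairs (the joy/anger pair comes from the paper's example; the other pairs and the exact functional form are assumptions):

```python
import math

# Label set of the dair-ai Emotion dataset.
EMOTIONS = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Pairs treated as mutually incompatible. Only (joy, anger) is given in the
# paper's example; the rest are illustrative assumptions.
CONTRADICTORY_PAIRS = [("joy", "anger"), ("joy", "sadness"), ("love", "anger")]

def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contradiction_loss(logits: list[float]) -> float:
    """Penalize simultaneous high confidence in incompatible emotions.

    Sums p(a) * p(b) over incompatible pairs (a, b); the term is near zero
    unless both probabilities are high, so consistent predictions are
    barely penalized while contradictory ones are.
    """
    p = softmax(logits)
    idx = {e: i for i, e in enumerate(EMOTIONS)}
    return sum(p[idx[a]] * p[idx[b]] for a, b in CONTRADICTORY_PAIRS)

# Confident single emotion ('joy') -> small penalty.
consistent = contradiction_loss([0.1, 4.0, 0.2, 0.1, 0.1, 0.1])
# High confidence in both 'joy' and 'anger' -> large penalty.
contradictory = contradiction_loss([0.1, 3.0, 0.2, 3.0, 0.1, 0.1])
```

In training this penalty would be added, with some weight, to the standard cross-entropy terms of the three task heads, steering the shared encoder toward emotionally consistent predictions.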