Dynamic Adaptive Attention and Supervised Contrastive Learning: A Novel Hybrid Framework for Text Sentiment Classification

arXiv cs.CL / 4/14/2026


Key Points

  • The paper introduces a hybrid sentiment-classification framework built on a BERT-based Transformer encoder that combines dynamic adaptive multi-head attention with supervised contrastive learning.
  • The dynamic adaptive attention uses a global context pooling vector to dynamically weight each attention head, improving focus on sentiment-critical tokens and reducing noise from irrelevant parts of long reviews.
  • The supervised contrastive learning branch reshapes the embedding space by encouraging tighter intra-class clustering and stronger inter-class separation.
  • Experiments on the IMDB dataset report 94.67% accuracy, exceeding prior strong baselines by 1.5–2.5 percentage points, while claiming the approach is lightweight and extensible to other text classification tasks.
  • Overall, the work targets common weaknesses of standard BERT/recurrent models in capturing long-range dependencies and handling ambiguous emotional expressions in lengthy texts.
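The paper itself does not ship code, but the head-gating idea in the points above can be sketched in a few lines. The sketch below assumes mean pooling over token embeddings for the global context vector and a single linear projection followed by a softmax for the per-head gate; both are plausible readings of "global context pooling vector", not the authors' confirmed parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_adaptive_attention(X, Wq, Wk, Wv, Wg, n_heads):
    """Multi-head self-attention whose heads are re-weighted by a gate
    computed from a global context vector (here: mean-pooled tokens).

    X: (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_model) projection matrices
    Wg: (d_model, n_heads) gate projection (assumed form)
    """
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    # Project and split into heads: (n_heads, seq_len, d_head)
    Q = (X @ Wq).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    K = (X @ Wk).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    V = (X @ Wv).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    # Standard scaled dot-product attention per head
    scores = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(d_head), axis=-1)
    heads = scores @ V                          # (n_heads, seq_len, d_head)
    # Global context pooling -> per-head gate weights summing to 1
    g = X.mean(axis=0)                          # (d_model,)
    head_gate = softmax(g @ Wg)                 # (n_heads,)
    heads = heads * head_gate[:, None, None]    # scale each head's output
    return heads.transpose(1, 0, 2).reshape(seq_len, d_model)
```

The gate lets the model down-weight heads that attend to sentiment-irrelevant spans of a long review, which is the mechanism the paper credits for its noise suppression; in the real model this would sit inside each BERT encoder layer.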

Abstract

The exponential growth of user-generated movie reviews on digital platforms has made accurate text sentiment classification a cornerstone task in natural language processing. Traditional models, including standard BERT and recurrent architectures, frequently struggle to capture long-distance semantic dependencies and resolve ambiguous emotional expressions in lengthy review texts. This paper proposes a novel hybrid framework that seamlessly integrates dynamic adaptive multi-head attention with supervised contrastive learning into a BERT-based Transformer encoder. The dynamic adaptive attention module employs a global context pooling vector to dynamically regulate the contribution of each attention head, thereby focusing on critical sentiment-bearing tokens while suppressing noise. Simultaneously, the supervised contrastive learning branch enforces tighter intra-class compactness and larger inter-class separation in the embedding space. Extensive experiments on the IMDB dataset demonstrate that the proposed model achieves competitive performance with an accuracy of 94.67%, outperforming strong baselines by 1.5–2.5 percentage points. The framework is lightweight, efficient, and readily extensible to other text classification tasks.
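The supervised contrastive branch described in the abstract is very likely the standard supervised contrastive (SupCon) objective: each sample is pulled toward other samples of the same class and pushed away from the rest. A minimal NumPy sketch of that loss, assuming L2-normalized sentence embeddings and a temperature hyperparameter `tau` (the paper's exact value isn't stated here):

```python
import numpy as np

def supcon_loss(Z, labels, tau=0.1):
    """Supervised contrastive loss over a batch of embeddings.

    Z: (n, d) embeddings (normalized inside); labels: (n,) integer classes.
    For each anchor, positives are same-class samples; all other samples
    serve as the softmax denominator.
    """
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Z @ Z.T / tau                         # temperature-scaled cosine similarities
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)     # exclude each anchor from its own denominator
    # Numerically stable log-softmax over the other samples in the batch
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    # Mean negative log-probability of positives per anchor
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return per_anchor.mean()
```

Minimizing this loss tightens intra-class clusters and widens inter-class margins in the embedding space, which is exactly the geometric effect the abstract attributes to the contrastive branch; in the full framework it would be combined with the usual cross-entropy classification loss.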