DimABSA: Building Multilingual and Multidomain Datasets for Dimensional Aspect-Based Sentiment Analysis

arXiv cs.CL · April 27, 2026


Key Points

  • The paper proposes a dimensional approach to aspect-based sentiment analysis by using continuous valence-arousal (VA) scores instead of coarse positive/negative labels.
  • It introduces DimABSA, a multilingual and multidomain dataset annotated with standard ABSA elements (aspect terms/categories and opinion terms) plus VA scores, covering six languages, four domains, and 76,958 aspect instances.
  • The authors define three subtasks that combine VA prediction with different ABSA elements, bridging conventional categorical ABSA to dimensional ABSA.
  • To evaluate these mixed categorical/continuous tasks, they introduce a new unified metric called continuous F1 (cF1) that accounts for VA prediction error.
  • A benchmark is reported using both prompted and fine-tuned large language models, and the dataset has been publicly released and used in Track A of SemEval-2026 Task 3 with 300+ participants.
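The move from categorical labels to continuous valence-arousal scores can be illustrated with a small sketch. The scale and neutral point below are assumptions for illustration (a 1–9 scale with 5 as neutral is common in VA annotation work, but the paper's exact scheme is not stated here):

```python
def coarse_label(valence, neutral=5.0):
    """Collapse a continuous valence score back to a categorical polarity."""
    if valence > neutral:
        return "positive"
    if valence < neutral:
        return "negative"
    return "neutral"

# Both aspects below reduce to the same "positive" label, but their VA pairs
# separate calm approval (low arousal) from excited delight (high arousal):
examples = {"service": (6.5, 3.0), "dessert": (8.5, 7.5)}
for aspect, (valence, arousal) in examples.items():
    print(f"{aspect}: {coarse_label(valence)} (V={valence}, A={arousal})")
```

This is the sense in which dimensional ABSA is strictly more expressive: the categorical view is recoverable from the VA scores, but not the reverse.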

Abstract

Aspect-Based Sentiment Analysis (ABSA) focuses on extracting sentiment at a fine-grained aspect level and has been widely applied across real-world domains. However, existing ABSA research relies on coarse-grained categorical labels (e.g., positive, negative), which limits its ability to capture nuanced affective states. To address this limitation, we adopt a dimensional approach that represents sentiment with continuous valence-arousal (VA) scores, enabling fine-grained analysis at both the aspect and sentiment levels. To this end, we introduce DimABSA, the first multilingual, dimensional ABSA resource annotated with both traditional ABSA elements (aspect terms, aspect categories, and opinion terms) and newly introduced VA scores. This resource contains 76,958 aspect instances across 42,590 sentences, spanning six languages and four domains. We further introduce three subtasks that combine VA scores with different ABSA elements, providing a bridge from traditional ABSA to dimensional ABSA. Given that these subtasks involve both categorical and continuous outputs, we propose a new unified metric, continuous F1 (cF1), which incorporates VA prediction error into standard F1. We provide a comprehensive benchmark using both prompted and fine-tuned large language models across all subtasks. Our results show that DimABSA is a challenging benchmark and provides a foundation for advancing multilingual dimensional ABSA. We publicly released the DimABSA dataset, which was used for Track A of SemEval-2026 Task 3, attracting over 300 participants.
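The abstract describes cF1 only as "incorporating VA prediction error into standard F1"; the paper's exact formulation is not given here. A minimal sketch of how such a metric could work, under assumed choices (exact span matching, Euclidean VA error, linear partial credit clipped at a tolerance `tol` — all hypothetical):

```python
def cf1(gold, pred, tol=2.0):
    """Hypothetical continuous-F1 sketch, NOT the paper's definition.

    gold, pred: dicts mapping an aspect span -> (valence, arousal).
    A matched span earns partial credit that decays linearly with the
    Euclidean VA error, reaching zero at error >= tol.
    """
    credit = 0.0
    for span, (pv, pa) in pred.items():
        if span in gold:
            gv, ga = gold[span]
            err = ((pv - gv) ** 2 + (pa - ga) ** 2) ** 0.5
            credit += max(0.0, 1.0 - err / tol)
    precision = credit / len(pred) if pred else 0.0
    recall = credit / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The design point this illustrates: unlike standard F1, a true positive is no longer worth exactly 1 — its value degrades with VA error, so the same metric scores both the categorical extraction and the continuous prediction.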