Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages

arXiv cs.CL / 3/20/2026

📰 News | Ideas & Deep Analysis | Models & Research

Key Points

  • The EthioEmo dataset for Ethiopian languages is extended with emotion intensity annotations in a multi-label framework, so that each labeled emotion also records how strongly it is expressed.
  • The work benchmarks encoder-only pretrained language models and open-source LLMs, finding African-centric encoder-only models consistently outperform LLMs on this task.
  • Incorporating emotion-intensity features improves multi-label emotion classification performance on the enriched EthioEmo dataset.
  • The findings highlight the importance of culturally and linguistically tailored small models for emotion understanding. The data is available on HuggingFace.
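
To make the multi-label-plus-intensity idea concrete, here is a minimal Python sketch of how one annotated instance might be represented. The emotion list, the 0 to 3 intensity scale, and the `Example` class are illustrative assumptions; the summary does not specify the actual label set or schema of the dataset.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical label set and scale: neither is stated in the summary.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
MAX_INTENSITY = 3  # assumed ordinal scale: 0 = absent, 3 = strongest


@dataclass
class Example:
    """One text with a per-emotion intensity annotation (illustrative only)."""
    text: str
    intensities: Dict[str, int]  # emotion -> intensity; a missing key means 0

    def intensity_vector(self) -> List[int]:
        """Return intensities in the fixed EMOTIONS order."""
        vector = []
        for emotion in EMOTIONS:
            level = self.intensities.get(emotion, 0)
            if not 0 <= level <= MAX_INTENSITY:
                raise ValueError(f"intensity out of range for {emotion}: {level}")
            vector.append(level)
        return vector

    def multi_label_vector(self) -> List[int]:
        """Collapse intensities to binary labels: present if intensity > 0."""
        return [1 if level > 0 else 0 for level in self.intensity_vector()]


example = Example("placeholder text", {"joy": 2, "surprise": 1})
print(example.intensity_vector())    # per-emotion intensities
print(example.multi_label_vector())  # binary multi-label targets
```

In this representation the binary multi-label targets are derived from the intensity annotations, so a single record can serve both the classification task and the intensity task.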

Abstract

Developing and integrating emotion-understanding models are essential for a wide range of human-computer interaction tasks, including customer feedback analysis, marketing research, and social media monitoring. Given that users often express multiple emotions simultaneously within a single instance, annotating emotion datasets in a multi-label format is critical for capturing this complexity. The EthioEmo dataset, a multilingual and multi-label emotion dataset for Ethiopian languages, lacks emotion intensity annotations, which are crucial for distinguishing varying degrees of emotion, as not all emotions are expressed with the same intensity. We extend the EthioEmo dataset to address this gap by adding emotion intensity annotations. Furthermore, we benchmark state-of-the-art encoder-only Pretrained Language Models (PLMs) and Large Language Models (LLMs) on this enriched dataset. Our results demonstrate that African-centric encoder-only models consistently outperform open-source LLMs, highlighting the importance of culturally and linguistically tailored small models in emotion understanding. Incorporating an emotion-intensity feature for multi-label emotion classification yields better performance. The data is available at https://huggingface.co/datasets/Tadesse/EthioEmo-intensities.
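
As an illustration of how multi-label predictions from different models could be compared, the sketch below computes a macro-averaged F1 score over binary label vectors. The abstract does not name the evaluation metric, so macro F1 is an assumption here, and `macro_f1` is a hypothetical helper rather than code from the paper.

```python
from typing import List


def macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Average the per-label F1 scores over all labels (unweighted)."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Convention assumed here: a label with no positives at all scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Toy data with two labels and three instances (not real results).
gold = [[1, 0], [1, 1], [0, 1]]
pred = [[1, 0], [0, 1], [0, 1]]
print(round(macro_f1(gold, pred), 4))
```

Macro averaging weights every emotion equally, which matters when some emotions are rare; a micro-averaged variant would instead pool counts across labels.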