Beyond Semantics: Measuring Fine-Grained Emotion Preservation in Small Language Model-Based Machine Translation

arXiv cs.CL / 5/1/2026

Key Points

  • The paper tests how well three small language models (EuroLLM, Aya Expanse, and Gemma) preserve fine-grained emotional nuance in machine translation, where semantic equivalence often takes precedence over emotional fidelity.
  • It uses the GoEmotions dataset (Reddit comments labeled with 28 emotion categories) to evaluate emotion preservation across five European languages via a backtranslation setup, sketched in code after this list.
  • The study examines whether the models’ inherent emotion-retention ability is sufficient, and whether emotion-aware prompting can further improve emotional fidelity.
  • It also assesses ModernBERT as a contemporary alternative to BERT for emotion classification to support MT evaluation.
  • Overall, the work provides an evaluation framework and comparative results focused specifically on emotional preservation rather than only semantic equivalence.
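
A minimal sketch of that backtranslation loop, assuming Hugging Face `transformers` pipelines: the checkpoint names, the prompt wording, and the use of a public GoEmotions classifier (`SamLowe/roberta-base-go_emotions`) are illustrative assumptions rather than the paper's exact setup, and matching the single top emotion label is only one possible preservation criterion.

```python
# Sketch of the backtranslation evaluation loop; models and prompts are
# stand-ins, not the paper's configuration.
from transformers import pipeline

translator = pipeline("text-generation", model="google/gemma-2-2b-it")  # stand-in SLM
classifier = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",  # a public GoEmotions classifier
    top_k=None,                                # return scores for all 28 labels
)

def translate(text: str, src: str, tgt: str) -> str:
    prompt = f"Translate the following {src} text into {tgt}:\n{text}\n"
    out = translator(prompt, max_new_tokens=128, do_sample=False,
                     return_full_text=False)
    return out[0]["generated_text"].strip()

def emotion_preserved(text: str, language: str) -> bool:
    # English -> target language -> English round trip.
    forward = translate(text, "English", language)
    back = translate(forward, language, "English")
    top = lambda scores: max(scores, key=lambda s: s["score"])["label"]
    # One simple criterion: does the top predicted emotion survive the trip?
    return top(classifier(text)[0]) == top(classifier(back)[0])
```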

Abstract

Preserving affective nuance remains a challenge in Machine Translation (MT), where semantic equivalence often takes precedence over emotional fidelity. This paper evaluates the performance of three state-of-the-art Small Language Models (SLMs) -- EuroLLM, Aya Expanse, and Gemma -- in maintaining fine-grained emotions during backtranslation. Using the GoEmotions dataset, which comprises Reddit comments across 28 distinct categories, we assess emotional preservation across five European languages: German, French, Spanish, Italian, and Polish. Specifically, we investigate (i) the inherent capability of these SLMs to retain emotional sentiment, (ii) the efficacy of emotion-aware prompting in improving preservation, and (iii) the performance of ModernBERT as a contemporary alternative to BERT for emotion classification in MT evaluation.
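
As a concrete reading of point (ii), emotion-aware prompting can be as simple as surfacing the source text's GoEmotions label in the translation instruction. The template below is a hypothetical illustration, not the paper's prompt:

```python
def plain_prompt(text: str, tgt: str) -> str:
    return f"Translate the following text into {tgt}:\n{text}"

def emotion_aware_prompt(text: str, tgt: str, emotion: str) -> str:
    # Inject the GoEmotions label so the model is told which affect
    # it should carry through the translation.
    return (f"The following text expresses {emotion}. "
            f"Translate it into {tgt}, preserving that emotion:\n{text}")
```

Running the same backtranslation evaluation under both templates then quantifies whatever emotional fidelity the extra instruction buys.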
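For point (iii), the emotion classifier is the measuring instrument itself, so ModernBERT slots in as a drop-in replacement for BERT. A minimal setup, assuming the public `answerdotai/ModernBERT-base` checkpoint and treating GoEmotions as the multi-label task it is:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base",
    num_labels=28,                              # the 28 GoEmotions categories
    problem_type="multi_label_classification",  # a comment can carry several emotions
)
# After fine-tuning on GoEmotions, this classifier scores the source and
# backtranslated texts in place of the BERT-based one.
```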