Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis

arXiv cs.CL / 4/30/2026

📰 News · Models & Research

Key Points

  • The paper evaluates aspect-based sentiment analysis (ABSA) across seven languages and four subtasks, addressing the field’s relative lack of non-English coverage despite recent transformer advances.
  • It compares transformer architectures across three data regimes—zero-resource, data-only, and full-resource—using cross-lingual transfer, code-switching, and machine translation.
  • Fine-tuned LLMs deliver the best overall results, especially for more complex generative ABSA tasks (see the sketch after these key points), while few-shot methods can nearly match them in simpler settings.
  • Cross-lingual training on multiple non-target languages is most beneficial for fine-tuned LLMs, whereas code-switching yields the biggest gains for smaller encoder and seq-to-seq models.
  • The authors release two new German datasets (an adapted GERestaurant and the first German ASQP dataset, GERest) to support multilingual ABSA research.

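The more complex subtasks mentioned above, TASD and ASQP, are typically framed generatively: the model reads a review sentence and emits its sentiment tuples as text (for ASQP, quadruples of aspect term, aspect category, opinion term, and sentiment polarity). The snippet below is a minimal sketch of one possible target linearization for such fine-tuning; the template, the SemEval-style category labels, and the German example sentence are illustrative assumptions, not the paper's exact format or data.

```python
# Illustrative only: one possible way to frame ASQP as a text-to-text task.
# The target template and category labels here are assumptions, not the
# paper's specification.

def linearize_quadruples(quadruples):
    """Turn ASQP quadruples (aspect term, category, opinion term, polarity)
    into a single target string for seq-to-seq / LLM fine-tuning."""
    parts = []
    for aspect_term, category, opinion_term, polarity in quadruples:
        parts.append(f"({aspect_term} | {category} | {opinion_term} | {polarity})")
    return " ; ".join(parts)


# Hypothetical German restaurant review (not taken from GERest):
review = "Das Schnitzel war hervorragend, aber der Service war langsam."
target = linearize_quadruples([
    ("Schnitzel", "FOOD#QUALITY", "hervorragend", "positive"),
    ("Service", "SERVICE#GENERAL", "langsam", "negative"),
])
print(review, "->", target)
```

Simpler subtasks such as ACD (aspect category detection) restrict the output to categories alone, which is the kind of setting where few-shot LLMs and smaller encoder models stay competitive.
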
Abstract

Aspect-based Sentiment Analysis (ABSA) extracts fine-grained opinions toward specific aspects within text but remains largely English-focused despite major advances in transformer-based and instruction-tuned models. This work presents a multilingual evaluation of state-of-the-art ABSA approaches across seven languages (English, German, French, Dutch, Russian, Spanish, and Czech) and four subtasks (ACD, ACSA, TASD, ASQP). We systematically compare different transformer architectures under zero-resource, data-only, and full-resource settings, using cross-lingual transfer, code-switching and machine translation. Fine-tuned Large Language Models (LLMs) achieve the highest overall scores, particularly in complex generative tasks, while few-shot counterparts approach this performance in simpler setups, where smaller encoder models also remain competitive. Cross-lingual training on multiple non-target languages yields the strongest transfer for fine-tuned LLMs, while smaller encoder or seq-to-seq models benefit most from code-switching, highlighting architecture-specific strategies for multilingual ABSA. We further contribute two new German datasets, an adapted GERestaurant and the first German ASQP dataset (GERest), to encourage multilingual ABSA research beyond English.
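Since code-switching is reported as the most effective strategy for smaller encoder and seq-to-seq models, a generic sketch of that augmentation is shown below. It assumes a simple token-level bilingual dictionary and a fixed switch probability; the paper's actual code-switching procedure (which tokens are swapped and how translations are obtained) may differ.

```python
# Illustrative only: a generic token-level code-switching augmentation,
# assuming a small hypothetical English->German dictionary. This is a sketch
# of the general idea, not the authors' implementation.
import random

EN_DE = {
    "food": "Essen",
    "service": "Service",
    "waiter": "Kellner",
    "delicious": "lecker",
    "slow": "langsam",
}

def code_switch(tokens, dictionary, switch_prob=0.3, seed=None):
    """Replace source-language tokens with target-language translations
    with probability `switch_prob`, leaving unknown tokens unchanged."""
    rng = random.Random(seed)
    switched = []
    for token in tokens:
        translation = dictionary.get(token.lower())
        if translation is not None and rng.random() < switch_prob:
            switched.append(translation)
        else:
            switched.append(token)
    return switched


sentence = "The food was delicious but the waiter was slow".split()
print(" ".join(code_switch(sentence, EN_DE, switch_prob=0.5, seed=0)))
# e.g. "The Essen was delicious but the Kellner was langsam"
```

The augmented sentences keep the original labels, so a model trained on them sees target-language aspect terms in otherwise source-language contexts, which is one intuition for why this helps smaller models transfer across languages.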