Not All Pretraining are Created Equal: Threshold Tuning and Class Weighting for Imbalanced Polarization Tasks in Low-Resource Settings

arXiv cs.LG / 3/26/2026

Key Points

  • The paper presents Transformer-based solutions for the SemEval-2025 Polarization Shared Task, covering binary polarization detection plus two multi-label classification subtasks in both English and Swahili.
  • It improves performance under severe class imbalance in low-resource settings using class-weighted loss, iterative stratified data splitting, and per-label threshold tuning.
  • The approach combines multilingual and African-language-specialized models (mDeBERTa-v3-base, SwahBERT, AfriBERTa-large), with the best validation result reported for mDeBERTa-v3-base.
  • The best configuration reaches 0.8032 macro-F1 on validation for binary detection, while the multi-label subtasks peak at 0.556 macro-F1, so the multi-label setting remains the weaker part of the system.
  • Error analysis highlights ongoing difficulties with implicit polarization, code-switching, and separating heated political rhetoric from true polarization signals.
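
To make per-label threshold tuning concrete, here is a minimal pure-Python sketch. It is not the paper's implementation: the grid of candidate thresholds, the function names (`f1_binary`, `tune_thresholds`, `macro_f1`), and the toy data are all assumptions made for illustration. The idea is that each label gets its own decision cutoff, chosen to maximize that label's F1 on validation data, instead of sharing a fixed 0.5.

```python
from typing import List, Optional, Sequence


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(
    probs: List[List[float]],
    labels: List[List[int]],
    grid: Optional[List[float]] = None,
) -> List[float]:
    """Pick, for each label, the grid threshold with the best validation F1."""
    if grid is None:
        # Hypothetical grid: 0.05, 0.10, ..., 0.95
        grid = [i / 100 for i in range(5, 100, 5)]
    thresholds = []
    for j in range(len(labels[0])):
        y_true = [row[j] for row in labels]
        p = [row[j] for row in probs]
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            f1 = f1_binary(y_true, [int(x >= t) for x in p])
            if f1 > best_f1:  # strict: ties keep the lowest threshold
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds


def macro_f1(
    probs: List[List[float]], labels: List[List[int]], thresholds: List[float]
) -> float:
    """Unweighted mean of per-label F1 at the given per-label thresholds."""
    scores = []
    for j, t in enumerate(thresholds):
        y_true = [row[j] for row in labels]
        y_pred = [int(row[j] >= t) for row in probs]
        scores.append(f1_binary(y_true, y_pred))
    return sum(scores) / len(scores)


# Toy validation set: label 0 is common, label 1 is rare and under-confident.
val_probs = [[0.9, 0.30], [0.8, 0.10], [0.2, 0.25],
             [0.1, 0.05], [0.7, 0.12], [0.3, 0.08]]
val_labels = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 0], [0, 0]]

tuned = tune_thresholds(val_probs, val_labels)
print(tuned)                                       # [0.35, 0.15]
print(macro_f1(val_probs, val_labels, [0.5, 0.5]))  # 0.5
print(macro_f1(val_probs, val_labels, tuned))       # 1.0
```

On this toy data the rare label never crosses 0.5, so its F1 is zero and the macro average drops to 0.5; a lower per-label cutoff recovers it. Thresholds chosen this way are fit to the validation set, so they should be scored on separate held-out data to get an unbiased estimate.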

Abstract

This paper describes my submission to the Polarization Shared Task at SemEval-2025, which addresses polarization detection and classification in social media text. I develop Transformer-based systems for English and Swahili across three subtasks: binary polarization detection, multi-label target type classification, and multi-label manifestation identification. The approach leverages multilingual and African language-specialized models (mDeBERTa-v3-base, SwahBERT, AfriBERTa-large), class-weighted loss functions, iterative stratified data splitting, and per-label threshold tuning to handle severe class imbalance. The best configuration, mDeBERTa-v3-base, achieves 0.8032 macro-F1 on validation for binary detection, with competitive performance on multi-label tasks (up to 0.556 macro-F1). Error analysis reveals persistent challenges with implicit polarization, code-switching, and distinguishing heated political discourse from genuine polarization.
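
The abstract does not give the weighting formula behind the class-weighted loss, so the following is only a sketch under an assumed scheme: each label's positive term is scaled by the ratio of negative to positive training examples, and the loss is binary cross-entropy over logits. The helper names (`pos_weights`, `weighted_bce`) and the toy labels are hypothetical. In a PyTorch training loop the same effect is usually obtained by passing such weights as `pos_weight` to `torch.nn.BCEWithLogitsLoss`.

```python
import math
from typing import List


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pos_weights(labels: List[List[int]]) -> List[float]:
    """Assumed scheme: weight_j = (#negatives of label j) / (#positives of label j)."""
    n = len(labels)
    weights = []
    for j in range(len(labels[0])):
        n_pos = sum(row[j] for row in labels)
        # Fall back to 1.0 if a label has no positives, to avoid dividing by zero.
        weights.append((n - n_pos) / n_pos if n_pos else 1.0)
    return weights


def weighted_bce(
    logits: List[List[float]], labels: List[List[int]], weights: List[float]
) -> float:
    """Mean binary cross-entropy with the positive term scaled per label."""
    total, count = 0.0, 0
    for z_row, y_row in zip(logits, labels):
        for z, y, w in zip(z_row, y_row, weights):
            log_p = -softplus(-z)      # log(sigmoid(z))
            log_not_p = -softplus(z)   # log(1 - sigmoid(z))
            total -= w * y * log_p + (1 - y) * log_not_p
            count += 1
    return total / count


# Toy training labels: label 0 is balanced, label 1 has one positive in four.
train_labels = [[1, 0], [1, 0], [0, 0], [0, 1]]
weights = pos_weights(train_labels)
print(weights)  # [1.0, 3.0]

# With all-zero logits every prediction is 0.5, so the loss shows the weighting
# directly: the single rare positive counts three times as much as a negative.
zero_logits = [[0.0, 0.0] for _ in train_labels]
print(weighted_bce(zero_logits, train_labels, weights))
```

Scaling only the positive term raises the cost of missing a rare label without changing how negatives are scored, which is one common way to counter severe imbalance; other schemes, such as inverse-frequency weights on both classes, would fit the same description.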