IndoBERT-Sentiment: Context-Conditioned Sentiment Classification for Indonesian Text

arXiv cs.CL / 4/9/2026


Key Points

  • The paper introduces IndoBERT-Sentiment, a context-conditioned model that uses both topical context and Indonesian text to improve sentiment classification versus context-free approaches.
  • IndoBERT-Sentiment is built on IndoBERT Large (335M parameters) and trained on 31,360 labeled context-text pairs spanning 188 topics.
  • The model reports strong performance, achieving an F1 macro of 0.856 and accuracy of 88.1% on its evaluation.
  • In comparisons on the same test set, it outperforms the best of three widely used general-purpose Indonesian sentiment baselines by 35.6 F1 points.
  • The authors argue that context-conditioning, previously used for relevancy classification, transfers effectively to sentiment analysis, correcting systematic errors made by context-free models.

Abstract

Existing Indonesian sentiment analysis models classify text in isolation, ignoring the topical context that often determines whether a statement is positive, negative, or neutral. We introduce IndoBERT-Sentiment, a context-conditioned sentiment classifier that takes both a topical context and a text as input, producing sentiment predictions grounded in the topic being discussed. Built on IndoBERT Large (335M parameters) and trained on 31,360 context-text pairs labeled across 188 topics, the model achieves an F1 macro of 0.856 and accuracy of 88.1%. In a head-to-head evaluation against three widely used general-purpose Indonesian sentiment models on the same test set, IndoBERT-Sentiment outperforms the best baseline by 35.6 F1 points. We show that context-conditioning, previously demonstrated for relevancy classification, transfers effectively to sentiment analysis and enables the model to correctly classify texts that are systematically misclassified by context-free approaches.
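The context-conditioning described above typically maps onto BERT's standard two-segment input format, where the topical context occupies the first segment and the text to classify occupies the second. A minimal sketch of that input packing (the special tokens follow standard BERT conventions; the paper's exact tokenizer, label set, and label order are assumptions, not taken from the source):

```python
# Sketch of context-conditioned input packing in BERT's two-segment
# format: [CLS] context [SEP] text [SEP]. With the Hugging Face
# transformers library this corresponds to tokenizer(context, text),
# which sets token_type_ids to distinguish the two segments.

# Assumed three-way label set; the paper's label order is unknown.
LABELS = ["negative", "neutral", "positive"]

def pack_pair(context: str, text: str) -> str:
    """Join a topical context and a text into one sequence,
    so the classifier's prediction is grounded in the topic."""
    return f"[CLS] {context} [SEP] {text} [SEP]"

def pack_isolated(text: str) -> str:
    """Context-free baseline input: the text alone."""
    return f"[CLS] {text} [SEP]"

# The same text can carry different sentiment under different topics,
# which is why the context segment is fed to the model at all.
example = pack_pair("kenaikan harga BBM", "Akhirnya naik juga!")
```

In practice the string concatenation above would be handled by the tokenizer's sentence-pair mode rather than built by hand; the sketch only illustrates why a context-conditioned model sees strictly more signal than an isolation-based one operating on `pack_isolated` inputs.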