Benchmarking LightGBM and BiLSTM for Sentiment Analysis on Indonesian E-Commerce Reviews

arXiv cs.CL / 5/5/2026


Key Points

  • The study compares traditional machine learning (run through the PyCaret AutoML framework) with deep learning for Indonesian e-commerce sentiment analysis, using a 15,000-sample dataset from Hugging Face.
  • For ML, it evaluates LightGBM, Logistic Regression, and SVM, while the DL approach uses a BiLSTM (Bidirectional LSTM) network to model sequential context.
  • The results show the BiLSTM model achieves the best performance overall, reaching 98.87% accuracy and an F1-score of 98.87%.
  • Among the ML methods, LightGBM performs best with 98.23% accuracy while also training very efficiently.
  • The authors conclude that BiLSTM is especially effective for capturing the sequential semantics in Indonesian review text for this sentiment classification task.
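
To make the accuracy and F1-score figures above concrete, here is a minimal sketch of how both metrics are computed from predicted and gold labels. The summary does not say which F1 averaging scheme the paper uses, so this example assumes binary labels and reports F1 for the positive class; the function names and the toy labels are hypothetical and not taken from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the `positive` class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy={acc:.4f}  f1={f1:.4f}")
```

Accuracy and F1 can coincide, as they do in the reported 98.87% figures, when false positives and false negatives are balanced; on imbalanced data the two can diverge noticeably.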

Abstract

This study presents a comparative analysis of two primary approaches in Natural Language Processing (NLP): Machine Learning (ML) utilizing the PyCaret AutoML framework, and Deep Learning (DL). The evaluation is conducted on a sentiment analysis task using an Indonesian e-commerce review dataset sourced from Hugging Face. The dataset, consisting of 15,000 samples, is partitioned into training, validation, and testing sets. The ML experiments compare LightGBM, Logistic Regression, and Support Vector Machine (SVM) algorithms, whereas the DL experiment implements a Bidirectional Long Short-Term Memory (BiLSTM) architecture. The experimental results demonstrate that the BiLSTM model outperforms all ML models, achieving an accuracy of 98.87% and an F1-Score of 98.87%. Meanwhile, LightGBM emerges as the best-performing ML model, with an accuracy of 98.23% and a highly efficient training time. This research proves that the BiLSTM architecture is highly capable of capturing the sequential context of Indonesian review texts, making it the superior model for this specific classification task.
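
The abstract says the 15,000 samples are partitioned into training, validation, and testing sets but does not give the proportions. As a rough illustration only, the sketch below assumes an 80/10/10 split with a fixed random seed; the ratios, the seed, and the function name `split_indices` are assumptions made for this example, not details from the paper.

```python
import random
from typing import List, Tuple


def split_indices(
    n: int,
    train_frac: float = 0.8,  # assumed ratio, not stated in the abstract
    val_frac: float = 0.1,    # assumed ratio, not stated in the abstract
    seed: int = 42,           # fixed seed so the split is reproducible
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle sample indices and cut them into train/validation/test lists."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test set")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(15_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The validation set would typically be used for model selection and early stopping, with the test set held back for the final reported metrics. For sentiment data with uneven class balance, a stratified split is a common alternative to the plain shuffle shown here.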