Comparative Analysis of AutoML and BiLSTM Models for Cyberbullying Detection on Indonesian Instagram Comments

arXiv cs.CL / 4/30/2026


Key Points

  • The study compares traditional machine learning and deep learning methods for detecting cyberbullying in Indonesian Instagram comments, using a balanced dataset of 650 comments labeled as Bullying or Non-Bullying.
  • For classical models, Naive Bayes, Logistic Regression, and SVM with TF-IDF features are evaluated, with Logistic Regression achieving the best performance among them.
  • For deep learning, a plain BiLSTM is benchmarked against a BiLSTM augmented with Bahdanau Attention, with the attention-based BiLSTM delivering the strongest overall performance among the deep learning models.
  • A preprocessing pipeline tailored to informal Indonesian text (slang normalization, stopword removal, and stemming) is applied, and the findings highlight the value of such domain-specific preprocessing.
  • The authors conclude that while deep learning better captures contextual signals, conventional machine learning remains viable for deployments with limited compute or resources.
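
The preprocessing steps named in the key points (slang normalization, stopword removal, stemming) can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the `SLANG`, `STOPWORDS`, and `SUFFIXES` tables and the `naive_stem` and `preprocess` helpers are hypothetical stand-ins, since the actual dictionaries and stemmer are not given in this summary.

```python
import re
from typing import List

# Hypothetical, tiny lookup tables for illustration only; the study's actual
# slang dictionary, stopword list, and stemmer are not given in this summary.
SLANG = {"gk": "tidak", "ga": "tidak", "bgt": "banget", "yg": "yang", "lu": "kamu"}
STOPWORDS = {"yang", "dan", "di", "ke", "itu", "ini"}
SUFFIXES = ("nya", "kan", "lah")  # naive suffix list, not a real Indonesian stemmer


def naive_stem(token: str) -> str:
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(comment: str) -> List[str]:
    """Lowercase, tokenize, normalize slang, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    tokens = [SLANG.get(t, t) for t in tokens]          # slang normalization
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return [naive_stem(t) for t in tokens]              # stemming


print(preprocess("Lu yg jelek bgt, gk ada bagusnya!"))
```

In a real setting the toy tables would be replaced by a curated slang lexicon and a proper Indonesian stemmer; the order of the three steps shown here is also an assumption.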

Abstract

This study compares machine learning and deep learning approaches for cyberbullying detection in Indonesian-language Instagram comments. Using a balanced dataset of 650 comments labeled as Bullying and Non-Bullying, the study evaluates Naive Bayes, Logistic Regression, and Support Vector Machine with TF-IDF features, as well as BiLSTM and BiLSTM with Bahdanau Attention. A preprocessing pipeline tailored to informal Indonesian text is applied, including slang normalization, stopword removal, and stemming. The results show that Logistic Regression performs best among the machine learning models, while BiLSTM with Attention achieves the strongest overall deep learning performance. The findings highlight the value of domain-specific preprocessing and show that although deep learning captures contextual patterns more effectively, machine learning remains a competitive option for resource-constrained deployments.
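
To make the attention idea concrete, here is a minimal sketch of additive (Bahdanau-style) attention pooling over a sequence of hidden states, written in plain Python. It assumes the query-free variant often used for text classification, where each BiLSTM output is scored as v · tanh(W h + b) and the scores are normalized with a softmax; the summary does not say whether the paper uses exactly this form. The function name `additive_attention`, the parameters `W`, `b`, `v`, and the toy values are all illustrative assumptions.

```python
import math
from typing import List, Tuple

Vector = List[float]


def additive_attention(
    states: List[Vector], W: List[Vector], b: Vector, v: Vector
) -> Tuple[Vector, Vector]:
    """Return (attention weights, context vector) for a sequence of hidden states."""
    scores = []
    for h in states:
        # u = tanh(W h + b), one entry per row of W
        u = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        # Scalar relevance score for this time step: v . u
        scores.append(sum(v_i * u_i for v_i, u_i in zip(v, u)))
    # Numerically stable softmax over time steps
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the hidden states
    dim = len(states[0])
    context = [sum(a * h[j] for a, h in zip(weights, states)) for j in range(dim)]
    return weights, context


# Toy stand-ins for BiLSTM outputs: 3 time steps, hidden size 2 (values are made up)
states = [[0.1, 0.2], [0.9, -0.4], [0.0, 0.5]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
v = [1.0, 1.0]
weights, context = additive_attention(states, W, b, v)
print(weights, context)
```

The context vector would then feed a classification layer in place of, for example, the final hidden state, which is one way a model can emphasize the tokens most indicative of bullying.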
