Multilingual Financial Fraud Detection Using Machine Learning and Transformer Models: A Bangla-English Study

arXiv cs.LG / 3/13/2026

Key Points

  • The study investigates multilingual Bangla-English financial fraud detection using a dataset of legitimate and fraudulent messages and compares classical ML with TF-IDF features to transformer-based architectures.
  • In 5-fold stratified cross-validation, Linear SVM achieved 91.59% accuracy and 91.30% F1, outperforming the transformer model (89.49% accuracy, 88.88% F1) by about 2 percentage points.
  • The transformer approach exhibited higher fraud recall (94.19%) but incurred higher false positive rates.
  • The results indicate that classical ML with well-crafted features remains competitive for multilingual fraud detection, while also highlighting challenges from linguistic diversity, code-mixing, and low-resource language constraints.
  • Exploratory analysis identifies distinctive patterns: scam messages tend to be longer and to contain urgency terms, URLs, and phone numbers.
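The classical baseline described above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the study's actual pipeline: the toy Bangla-English messages, the TF-IDF settings, and the SVM hyperparameters are all assumptions for demonstration purposes.

```python
# Sketch of a TF-IDF + Linear SVM baseline evaluated with 5-fold
# stratified cross-validation, mirroring the setup the paper reports.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Illustrative placeholder messages (code-mixed Bangla-English),
# not the paper's dataset.
fraud = [
    "URGENT! Apnar bKash account lock hoyeche, verify now http://scam.example",
    "Congratulations! You won 50,000 taka, call 01711111111 immediately",
    "Lottery winner! Send fee to claim prize, call 01722222222 now",
    "Your card is blocked, click http://fake.example to reactivate urgently",
    "Free offer sesh hobe today! Confirm now at http://phish.example",
    "Account suspended! Verify within 1 hour, call 01733333333",
]
legit = [
    "You have received Tk 1,200.00 from Rahim. Balance Tk 5,430.00",
    "Cash In Tk 500.00 successful. TrxID ABC123. Balance Tk 900.00",
    "Payment of Tk 2,000.00 to City Store completed. Fee Tk 0.00",
    "Apnar bill Tk 850.00 porishodh hoyeche. Thank you",
    "Send Money Tk 300.00 successful. Balance Tk 1,500.00",
    "Mobile recharge Tk 100.00 successful. New balance Tk 120.00",
]

texts = fraud + legit
labels = [1] * len(fraud) + [0] * len(legit)  # 1 = fraudulent, 0 = legitimate

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
    ("svm", LinearSVC(C=1.0)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_validate(model, texts, labels, cv=cv,
                        scoring=["accuracy", "f1"])
print(f"mean accuracy: {scores['test_accuracy'].mean():.3f}")
print(f"mean F1:       {scores['test_f1'].mean():.3f}")
```

On the paper's full dataset this configuration is reported to reach 91.59% accuracy; on toy data like the above the numbers are of course meaningless and shown only to illustrate the evaluation protocol.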

Abstract

Financial fraud detection has emerged as a critical research challenge amid the rapid expansion of digital financial platforms. Although machine learning approaches have demonstrated strong performance in identifying fraudulent activities, most existing research focuses exclusively on English-language data, limiting applicability to multilingual contexts. Bangla (Bengali), despite being spoken by over 250 million people, remains largely unexplored in this domain. In this work, we investigate financial fraud detection in a multilingual Bangla-English setting using a dataset comprising legitimate and fraudulent financial messages. We evaluate classical machine learning models (Logistic Regression, Linear SVM, and Ensemble classifiers) using TF-IDF features alongside transformer-based architectures. Experimental results using 5-fold stratified cross-validation demonstrate that Linear SVM achieves the best performance with 91.59 percent accuracy and 91.30 percent F1 score, outperforming the transformer model (89.49 percent accuracy, 88.88 percent F1) by approximately 2 percentage points. The transformer exhibits higher fraud recall (94.19 percent) but suffers from elevated false positive rates. Exploratory analysis reveals distinctive patterns: scam messages are longer, contain urgency-inducing terms, and frequently include URLs (32 percent) and phone numbers (97 percent), while legitimate messages feature transactional confirmations and specific currency references. Our findings highlight that classical machine learning with well-crafted features remains competitive for multilingual fraud detection, while also underscoring the challenges posed by linguistic diversity, code-mixing, and low-resource language constraints.
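The surface-level patterns the abstract highlights (message length, urgency terms, URLs, phone numbers) can be captured with simple feature extraction. The sketch below is an illustrative assumption: the regexes and the urgency-term list are placeholders, not the paper's actual feature definitions.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Bangladeshi mobile numbers commonly start with 01 and span 11 digits;
# this pattern is an illustrative assumption, not the study's definition.
PHONE_RE = re.compile(r"\b01\d{9}\b")
# Hypothetical urgency lexicon for demonstration only.
URGENCY_TERMS = {"urgent", "immediately", "now", "verify", "winner"}

def surface_features(message: str) -> dict:
    """Extract simple scam-indicator features from one message."""
    tokens = [t.strip(".,!?") for t in message.lower().split()]
    return {
        "length": len(message),
        "has_url": bool(URL_RE.search(message)),
        "has_phone": bool(PHONE_RE.search(message)),
        "urgency_hits": sum(t in URGENCY_TERMS for t in tokens),
    }

print(surface_features(
    "URGENT! Verify now at http://scam.example or call 01712345678"))
```

Features like these could feed a classical classifier alongside TF-IDF vectors; the abstract's finding that 97% of scam messages contain phone numbers suggests such indicators carry substantial signal.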