An Experimental Comparison of the Most Popular Approaches to Fake News Detection

arXiv cs.CL / 3/27/2026


Key Points

  • The paper provides an experimental, cross-method comparison of 12 fake news detection approaches, covering classical ML, deep learning, transformers, and specialized cross-domain architectures.
  • It evaluates models on 10 public datasets by converting labels into a consistent binary “Real vs Fake” scheme, while noting this harmonization can remove dataset-specific label semantics.
  • Experiments across in-domain, multi-domain, and cross-domain settings show that fine-tuned models typically perform well in-domain but generalize poorly under domain shift and out-of-distribution conditions.
  • Cross-domain architectures can improve robustness, but they are often data-hungry, while LLM-based zero- and few-shot strategies are presented as a promising alternative.
  • The authors caution that dataset confounds and potential pre-training exposure may affect results, framing the study as a robustness evaluation limited to English, text-only fake news classification.
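The label harmonization mentioned above can be sketched as a simple mapping from each dataset's native label vocabulary onto the shared binary scheme. The label names and mapping below are hypothetical illustrations; the paper does not publish its exact mapping table.

```python
# Hypothetical mapping from dataset-specific labels onto the shared
# binary "Real vs Fake" scheme; real datasets may use finer-grained
# labels (e.g. LIAR-style truthfulness scales) that this collapses.
RAW_TO_BINARY = {
    "true": "Real", "mostly-true": "Real", "real": "Real",
    "false": "Fake", "pants-fire": "Fake", "fake": "Fake",
}

def harmonize(raw_label: str) -> str:
    """Map a dataset-specific label onto Real/Fake, or fail loudly."""
    key = raw_label.strip().lower()
    if key not in RAW_TO_BINARY:
        raise ValueError(f"unmapped label: {raw_label!r}")
    return RAW_TO_BINARY[key]
```

Failing on unmapped labels (rather than guessing) makes the semantic loss the authors warn about explicit: every collapsed distinction is a deliberate entry in the table.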

Abstract

In recent years, fake news detection has received increasing attention in public debate and scientific research. Despite advances in detection techniques, the production and spread of false information have become more sophisticated, driven by Large Language Models (LLMs) and the amplification power of social media. We present a critical assessment of 12 representative fake news detection approaches, spanning traditional machine learning, deep learning, transformers, and specialized cross-domain architectures. We evaluate these methods on 10 publicly available datasets differing in genre, source, topic, and labeling rationale. We address text-only English fake news detection as a binary classification task by harmonizing labels into "Real" and "Fake" to ensure a consistent evaluation protocol. We acknowledge that label semantics vary across datasets and that harmonization inevitably removes such semantic nuances. Each dataset is treated as a distinct domain. We conduct in-domain, multi-domain and cross-domain experiments to simulate real-world scenarios involving domain shift and out-of-distribution data. Fine-tuned models perform well in-domain but struggle to generalize. Cross-domain architectures can reduce this gap but are data-hungry, while LLMs offer a promising alternative through zero- and few-shot learning. Given inherent dataset confounds and possible pre-training exposure, results should be interpreted as robustness evaluations within this English, text-only protocol.
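The evaluation protocol described in the abstract, with each dataset treated as a domain, can be sketched as a train/test grid: the diagonal gives in-domain scores, the off-diagonal cells give cross-domain scores under domain shift. The `train` and `evaluate` callables are hypothetical stand-ins for whichever model and metric are used.

```python
# Sketch of the in-domain vs cross-domain protocol: train on each
# domain's training split, evaluate on every domain's test split.
from itertools import product

def run_protocol(domains, train, evaluate):
    """Return {(train_domain, test_domain): score}.

    `domains` maps a domain name to {"train": ..., "test": ...};
    diagonal entries are in-domain, the rest are cross-domain.
    """
    results = {}
    for src, tgt in product(domains, repeat=2):
        model = train(domains[src]["train"])
        results[(src, tgt)] = evaluate(model, domains[tgt]["test"])
    return results
```

A multi-domain run would differ only in pooling several domains' training splits before calling `train`; the gap between diagonal and off-diagonal cells is the generalization gap the paper measures.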