MALicious INTent Dataset and Inoculating LLMs for Enhanced Disinformation Detection
arXiv cs.CL / 3/17/2026
Key Points
- MALINT is the first human-annotated English corpus capturing disinformation and malicious intent, developed with expert fact-checkers.
- The work benchmarks 12 language models, ranging from small models such as BERT to large models such as Llama 3.3, on binary and multilabel intent classification tasks.
- It proposes intent-based inoculation, an intent-augmented reasoning approach in which an LLM integrates intent analysis into its reasoning to mitigate the persuasive impact of disinformation.
- The authors demonstrate that intent-augmented reasoning improves zero-shot disinformation detection across six datasets, five LLMs, and seven languages, and they release the MALINT dataset with annotations.
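
The intent-augmented reasoning idea in the points above can be pictured as a two-step zero-shot prompt: first ask the model to analyze intent, then ask for a disinformation verdict conditioned on that analysis. The sketch below is only an illustration under stated assumptions, not the authors' prompts or pipeline. The intent labels, the prompt wording, the function names, and the `llm` callable (any function that maps a prompt string to a response string) are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical placeholder labels; this summary does not list MALINT's
# actual intent categories.
EXAMPLE_INTENTS = ("intent_a", "intent_b", "intent_c")


def build_intent_prompt(text: str, intents: Sequence[str] = EXAMPLE_INTENTS) -> str:
    """Step 1: ask which of the given intents, if any, the text shows."""
    options = ", ".join(intents)
    return (
        "Read the text and list which of these intents it shows "
        f"({options}), or answer 'none'.\n\nText: {text}"
    )


def build_detection_prompt(text: str, intent_analysis: str) -> str:
    """Step 2: ask for a verdict, conditioned on the intent analysis."""
    return (
        "Using the intent analysis below, decide whether the text is "
        "disinformation. Answer 'yes' or 'no'.\n\n"
        f"Intent analysis: {intent_analysis}\n\nText: {text}"
    )


def detect(
    text: str,
    llm: Callable[[str], str],
    intents: Sequence[str] = EXAMPLE_INTENTS,
) -> bool:
    """Run both steps and return True if the model answers 'yes'."""
    analysis = llm(build_intent_prompt(text, intents))
    verdict = llm(build_detection_prompt(text, analysis))
    return verdict.strip().lower().startswith("yes")
```

Keeping the intent step separate from the verdict step makes the intermediate analysis inspectable, and `llm` can be swapped for any model client without changing the prompt logic.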




