Large Language Models in the Abuse Detection Pipeline
arXiv cs.CL / 4/3/2026
Key Points
- The paper surveys how large language models (LLMs) can be integrated into the full Abuse Detection Lifecycle (ADL) to handle increasingly complex online abuse beyond what static classifiers and heavy labeling can manage.
- It breaks the ADL into four stages—Label & Feature Generation, Detection, Review & Appeals, and Auditing & Governance—and synthesizes emerging research and industry practices for each stage.
- The authors describe production-relevant architectural considerations and discuss where LLMs add value, including contextual reasoning, policy interpretation, explanation generation, and cross-modal understanding.
- The paper also emphasizes limitations and operational challenges for LLM-driven abuse detection, focusing on latency, cost-efficiency, determinism, adversarial robustness, and fairness.
- It concludes with key future research directions needed to make LLMs reliable and accountable components in large-scale, governed safety systems.
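The four-stage lifecycle described above can be made concrete with a small sketch. This is an illustrative Python outline, not the paper's implementation: the `Stage` enum names the four ADL stages the survey identifies, and the `llm_classify` callable, `Verdict` fields, and appeal-overturn audit metric are hypothetical stand-ins for whatever a production system would use.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable

class Stage(Enum):
    """The four ADL stages named in the paper."""
    LABEL_AND_FEATURE_GENERATION = auto()
    DETECTION = auto()
    REVIEW_AND_APPEALS = auto()
    AUDITING_AND_GOVERNANCE = auto()

@dataclass
class Verdict:
    label: str            # e.g. "abusive" / "benign"
    rationale: str        # LLM-generated explanation for human reviewers
    appealed: bool = False

def detect(text: str, llm_classify: Callable[[str], Verdict]) -> Verdict:
    # Stage 2 (Detection): delegate contextual reasoning to an
    # LLM-backed classifier, injected here as a plain callable.
    return llm_classify(text)

def review(verdict: Verdict, human_overturns: bool) -> Verdict:
    # Stage 3 (Review & Appeals): a human reviewer can overturn
    # the model's label; the flip is recorded for auditing.
    if human_overturns:
        verdict.label = "benign" if verdict.label == "abusive" else "abusive"
        verdict.appealed = True
    return verdict

def audit(verdicts: list[Verdict]) -> float:
    # Stage 4 (Auditing & Governance): the appeal-overturn rate is
    # one possible governance signal over a batch of decisions.
    if not verdicts:
        return 0.0
    return sum(v.appealed for v in verdicts) / len(verdicts)
```

In this sketch the LLM's explanation (`rationale`) is what the survey highlights as an advantage in the Review & Appeals stage, and the audit metric illustrates how downstream governance can consume structured decision records rather than raw model outputs.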