When Hate Meets Facts: LLMs-in-the-Loop for Check-worthiness Detection in Hate Speech
arXiv cs.CL / 3/27/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper releases WSF-ARG+, a dataset that jointly labels hate speech with “check-worthiness” (whether embedded claims merit fact-checking) to address the overlap between hate content and misinformation.
- It proposes an LLM-in-the-loop annotation framework that uses 12 open-weight LLMs to reduce human annotation effort while maintaining label quality, validated through extensive human evaluation.
- The authors find that hate speech containing check-worthy claims is associated with significantly higher harassment and hate intensity.
- Incorporating check-worthiness labels improves LLM-based hate speech detection, with gains of up to 0.213 macro-F1 for large models and 0.154 macro-F1 on average.
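This summary does not spell out how the 12 LLM annotators' outputs are combined or when humans are pulled back into the loop. A minimal sketch of one common LLM-in-the-loop pattern (majority vote across models, with low-agreement items routed to human annotators) might look like the following; the function name, threshold, and label strings are illustrative assumptions, not the paper's actual pipeline:

```python
from collections import Counter

def aggregate_llm_labels(model_labels, agreement_threshold=0.75):
    """Combine check-worthiness labels from several LLM annotators.

    model_labels: list of label strings, one per LLM (e.g. 12 models).
    Returns (majority_label, agreement, needs_human_review), where
    agreement is the fraction of models that voted for the majority
    label, and needs_human_review is True when agreement falls below
    the (hypothetical) threshold, so a human annotator takes over.
    """
    counts = Counter(model_labels)
    majority_label, votes = counts.most_common(1)[0]
    agreement = votes / len(model_labels)
    return majority_label, agreement, agreement < agreement_threshold

# Example: 9 of 12 models agree -> accepted without human review.
labels = ["check-worthy"] * 9 + ["not-check-worthy"] * 3
label, agreement, review = aggregate_llm_labels(labels)
```

Routing only low-agreement items to humans is what lets a scheme like this cut annotation effort while keeping quality high: humans see the hard cases, and high-consensus items are labeled automatically.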