ClaimPT: A Portuguese Dataset of Annotated Claims in News Articles
arXiv cs.CL / 3/30/2026
Key Points
- The paper argues that fact-checking is still labor-intensive and that automated claim identification is a key first step toward speeding up the debunking of misinformation.
- It introduces ClaimPT, a dataset of European Portuguese news articles annotated for factual claims, containing 1,308 articles and 6,875 annotations collected via a partnership with LUSA (Portuguese News Agency).
- ClaimPT emphasizes journalistic content rather than sources like social media or parliamentary transcripts, aiming to better reflect real-world news claim detection.
- To support annotation quality, each article was annotated by two trained annotators and then validated by a curator, following a newly proposed annotation scheme.
- The authors release baseline models for claim detection to provide initial benchmarks and enable downstream NLP/IR research for low-resource Portuguese fact-checking.
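To make the reported dataset size concrete, the following is a minimal sketch in Python. It assumes a hypothetical span-style record (`ClaimAnnotation`) and a made-up example sentence; neither reflects ClaimPT's actual release format, which this summary does not describe. Only the counts 1,308 and 6,875 come from the summary above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimAnnotation:
    """Hypothetical record for one annotated claim span (not ClaimPT's real schema)."""
    article_id: str
    start: int  # character offset where the span begins
    end: int    # character offset one past the last character of the span
    text: str


def annotations_per_article(num_annotations: int, num_articles: int) -> float:
    """Average number of annotations per article."""
    if num_articles <= 0:
        raise ValueError("num_articles must be positive")
    return num_annotations / num_articles


def span_from_text(article_id: str, article_text: str, claim_text: str) -> ClaimAnnotation:
    """Build a span record by locating the first occurrence of claim_text in the article."""
    start = article_text.find(claim_text)
    if start == -1:
        raise ValueError("claim_text does not occur in article_text")
    return ClaimAnnotation(article_id, start, start + len(claim_text), claim_text)


# Counts reported in the summary: 6,875 annotations over 1,308 articles.
density = annotations_per_article(6875, 1308)

# Made-up sentence for illustration only; it is not taken from the dataset.
example_article = "O ministro afirmou que o desemprego caiu 2% em 2025."
example_claim = "o desemprego caiu 2% em 2025"
example_annotation = span_from_text("demo-001", example_article, example_claim)

print(f"{density:.2f} annotations per article on average")
print(example_article[example_annotation.start:example_annotation.end])
```

By this arithmetic, the dataset averages roughly five annotations per article. Storing character offsets alongside the span text is one common design choice for span annotation, since it lets a curator check each span against the source article.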