Interpretable Predictability-Based AI Text Detection: A Replication Study
arXiv cs.CL / 3/17/2026
Key Points
- The paper replicates and extends the AuTexTification 2023 system for authorship attribution of machine-generated texts. The authors note that exact replication was hindered by data splits, model availability, and implementation details.
- It adds 26 document-level stylometric features and experiments with newer multilingual language models, applying SHAP to understand feature influence on decisions.
- The study replaces GPT-2 with newer generative models such as Qwen and mGPT for probabilistic features, and uses mDeBERTa-v3-base for contextual representations across English and Spanish.
- The multilingual configuration achieves results comparable to or better than those of language-specific models on both Subtask 1 and Subtask 2.
- The authors stress that clear documentation is crucial for reliable replication and fair comparison of systems.
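
To make the idea of "probabilistic features" concrete, here is a minimal Python sketch of how per-token log-probabilities from a generative language model might be condensed into document-level predictability features. The summary above does not list the paper's exact features, so the function name `predictability_features`, the four statistics it returns, and the hard-coded log-probability values are all illustrative assumptions. In a real pipeline the log-probabilities would come from a causal language model such as the ones named above.

```python
import math
from statistics import mean, pstdev


def predictability_features(token_logprobs):
    """Summarize per-token natural-log probabilities into document-level features.

    Hypothetical helper: the feature set is an assumption, not the paper's.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg = mean(token_logprobs)
    return {
        "mean_logprob": avg,                     # higher means more predictable text
        "perplexity": math.exp(-avg),            # exp of the negative mean log-prob
        "min_logprob": min(token_logprobs),      # the single most surprising token
        "std_logprob": pstdev(token_logprobs),   # how uneven the predictability is
    }


# Made-up log-probabilities standing in for language model output.
predictable = [-0.2, -0.1, -0.3, -0.2]   # every token was easy to predict
surprising = [-2.5, -0.4, -4.1, -1.0]    # several tokens were unexpected

feats_a = predictability_features(predictable)
feats_b = predictability_features(surprising)
print(feats_a["perplexity"] < feats_b["perplexity"])
```

Feature vectors of this kind could then be passed to a classifier, typically alongside stylometric features, and because each feature has a plain meaning, an attribution method such as SHAP can report which ones drove a given decision.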