Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

arXiv cs.CL / 4/29/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper proposes Luminol-AIDetect, a zero-shot, model-agnostic method for detecting machine-generated text that avoids relying on model-specific fingerprints.
It hypothesizes that autoregressive LLMs exhibit structural fragility that becomes visible when text is randomly shuffled, producing a characteristic change (“dispersion”) in perplexity.
Luminol-AIDetect computes a small set of perplexity-based scalar features from the original text and its shuffled version, then performs detection using density estimation and ensemble-based prediction.
Experiments across 8 content domains, 11 adversarial attack types, and 18 languages show state-of-the-art results, including up to 17× lower false-positive rates, while also being cheaper than prior approaches.
The work frames detection as identifying structurally invariant signals (coherence disruption under shuffling) that distinguish MGT from human text across diverse settings.

Abstract

Machine-generated text (MGT) detection requires identifying structurally invariant signals across generation models, rather than relying on model-specific fingerprints. In this respect, we hypothesize that while large language models excel at local semantic consistency, their autoregressive nature results in a specific kind of structural fragility compared to human writing. We propose Luminol-AIDetect, a novel, zero-shot statistical approach that exposes this fragility through coherence disruption. By applying a simple randomized text-shuffling procedure, we demonstrate that the resulting shift in perplexity serves as a principled, model-agnostic discriminant, as MGT displays a characteristic dispersion in perplexity-under-shuffling that differs markedly from the more stable structural variability of human-written text. Luminol-AIDetect leverages this distinction to inform its decision process, where a handful of perplexity-based scalar features are extracted from an input text and its shuffled version, then detection is performed via density estimation and ensemble-based prediction. Evaluated across 8 content domains, 11 adversarial attack types, and 18 languages, Luminol-AIDetect demonstrates state-of-the-art performance, with gains up to 17x lower FPR while being cheaper than prior methods.

LLMs will be a commodity

Reddit r/artificial

Indian Developers: How to Build AI Side Income with $0 Capital in 2026

Dev.to

What it feels like to have to have Qwen 3.6 or Gemma 4 running locally

Reddit r/LocalLLaMA

Dex lands $5.3M to grow its AI-driven talent matching platform

Tech.eu

AI Citation Registry: Why Daily Updates Leave No Time for Data Structuring

Dev.to

Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

Key Points

Abstract

Related Articles

LLMs will be a commodity

Indian Developers: How to Build AI Side Income with $0 Capital in 2026

What it feels like to have to have Qwen 3.6 or Gemma 4 running locally

Dex lands $5.3M to grow its AI-driven talent matching platform

AI Citation Registry: Why Daily Updates Leave No Time for Data Structuring

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer