Beyond the Final Actor: Modeling the Dual Roles of Creator and Editor for Fine-Grained LLM-Generated Text Detection

arXiv cs.CL / 4/7/2026


Key Points

  • The paper argues that existing LLM-generated text detectors are too coarse (binary/ternary) to support nuanced policy regulation, such as distinguishing LLM-polished human text from humanized LLM text.
  • It introduces a rigorous four-class detection setting and proposes RACE (Rhetorical Analysis for Creator-Editor Modeling) to model creator vs. editor differences in generated text.
  • RACE uses Rhetorical Structure Theory to build a logic graph representing the creator’s foundation and extracts Elementary Discourse Unit-level features to capture the editor’s style.
  • Experiments show RACE outperforms 12 baselines on fine-grained identification while keeping false alarms low, offering a policy-aligned detection approach.
  • The work frames LLM misuse detection as a creator–editor interaction problem rather than a single source classification task, improving interpretability for governance use cases.
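The four-class setting and the creator-editor framing above can be sketched as a tiny label space. This is an illustrative assumption, not the authors' implementation: the class names and the `policy_consequence` helper are hypothetical, chosen only to show why binary detection collapses labels that carry different policy consequences.

```python
from enum import Enum

class TextOrigin(Enum):
    """Hypothetical names for the paper's four fine-grained classes."""
    HUMAN = "pure human-written"
    LLM = "pure LLM-generated"
    LLM_POLISHED_HUMAN = "human text polished by an LLM"
    HUMANIZED_LLM = "LLM text edited by a human"

def policy_consequence(label: TextOrigin) -> str:
    """Toy mapping from fine-grained label to a coarse policy bucket,
    keyed on the *creator* rather than the final editor."""
    if label in (TextOrigin.HUMAN, TextOrigin.LLM_POLISHED_HUMAN):
        return "human-created"  # creator is human; editor may be an LLM
    return "llm-created"        # creator is an LLM; editor may be human
```

A binary detector that only sees the final surface form would likely merge `LLM_POLISHED_HUMAN` and `HUMANIZED_LLM`, even though they land in opposite policy buckets here.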

Abstract

The misuse of large language models (LLMs) requires precise detection of synthetic text. Existing works mainly follow binary or ternary classification settings, which can at best distinguish pure human/LLM text from collaborative text. This remains insufficient for nuanced regulation, as LLM-polished human text and humanized LLM text often trigger different policy consequences. In this paper, we explore fine-grained LLM-generated text detection under a rigorous four-class setting. To handle such complexities, we propose RACE (Rhetorical Analysis for Creator-Editor Modeling), a fine-grained detection method that characterizes the distinct signatures of creator and editor. Specifically, RACE utilizes Rhetorical Structure Theory to construct a logic graph for the creator's foundation while extracting Elementary Discourse Unit-level features for the editor's style. Experiments show that RACE outperforms 12 baselines in identifying fine-grained types with low false alarms, offering a policy-aligned solution for LLM regulation.
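The Elementary Discourse Units (EDUs) mentioned in the abstract are the clause-level spans that Rhetorical Structure Theory parsers operate on. A minimal sketch of what EDU segmentation produces, assuming a crude rule-based splitter rather than the trained discourse parser a real RST pipeline would use:

```python
import re

def naive_edu_segment(text: str) -> list[str]:
    """Crude stand-in for EDU segmentation: split on sentence-final
    punctuation, then split clauses before common discourse connectives.
    Real EDU segmenters use trained parsers; this is only illustrative."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    edus = []
    for s in sentences:
        # split "..., but ..." style clauses at the connective
        parts = re.split(r",\s+(?=(?:but|because|although|while|which)\b)", s)
        edus.extend(p.strip() for p in parts if p.strip())
    return edus
```

For example, `naive_edu_segment("The model was fine-tuned, but the results were mixed. It works.")` yields three units. In RACE, features computed per EDU are used to capture the editor's style, while the relations between EDUs form the creator's logic graph.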