Authorship Impersonation via LLM Prompting does not Evade Authorship Verification Methods

arXiv cs.CL / 4/1/2026


Key Points

  • The study tests whether prompted LLMs (here, GPT-4o) can produce convincing author impersonations and whether those texts can evade existing authorship verification (AV) systems.
  • Impersonation attempts were generated under four prompting conditions across three genres—emails, text messages, and social media posts.
  • Evaluations against multiple non-neural and neural AV methods using a likelihood-ratio framework found that LLM outputs did not sufficiently replicate individual authorial signatures to bypass established systems.
  • Some AV methods rejected LLM impersonation texts more accurately than genuine negative samples, suggesting AV systems can effectively distinguish impersonations.
  • The paper attributes this resilience, at least in part, to the higher lexical diversity and entropy of LLM-generated text, which weaken impersonation mimicry.
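The lexical-diversity and entropy measures invoked in the last point can be illustrated with a minimal sketch. This is not the paper's code; the function names and the toy sentences are illustrative assumptions, showing only the standard definitions of type-token ratio and unigram Shannon entropy:

```python
import math
from collections import Counter

def lexical_diversity(tokens):
    """Type-token ratio: unique words divided by total words."""
    return len(set(tokens)) / len(tokens)

def shannon_entropy(tokens):
    """Shannon entropy (bits) of the unigram frequency distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy example: a repetitive text scores lower on both measures
# than a varied one of the same length.
repetitive = "the cat sat on the mat and the cat sat".split()
varied = "sleek felines often lounge atop woven rugs near windows quietly".split()

assert lexical_diversity(varied) > lexical_diversity(repetitive)
assert shannon_entropy(varied) > shannon_entropy(repetitive)
```

On this reading, an LLM that spreads probability mass over a wider vocabulary than the target author would push both measures up, leaving a detectable trace even when surface style is imitated.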

Abstract

Authorship verification (AV), the task of determining whether a questioned text was written by a specific individual, is a critical part of forensic linguistics. While manual authorial impersonation by perpetrators has long been a recognized threat in historical forensic cases, recent advances in large language models (LLMs) raise new challenges, as adversaries may exploit these tools to impersonate another's writing. This study investigates whether prompted LLMs can generate convincing authorial impersonations and whether such outputs can evade existing forensic AV systems. Using GPT-4o as the adversary model, we generated impersonation texts under four prompting conditions across three genres: emails, text messages, and social media posts. We then evaluated these outputs against both non-neural AV methods (n-gram tracing, Ranking-Based Impostors Method, LambdaG) and neural approaches (AdHominem, LUAR, STAR) within a likelihood-ratio framework. Results show that LLM-generated texts failed to sufficiently replicate authorial individuality to bypass established AV systems. We also observed that some methods achieved even higher accuracy when rejecting impersonation texts compared to genuine negative samples. Overall, these findings indicate that, despite the accessibility of LLMs, current AV systems remain robust against entry-level impersonation attempts across multiple genres. Furthermore, we demonstrate that this counter-intuitive resilience stems, at least in part, from the higher lexical diversity and entropy inherent in LLM-generated texts.
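The abstract evaluates all AV methods "within a likelihood-ratio framework". The paper does not give its calibration details here; the following is a minimal sketch of the general idea under an assumed Gaussian score model, where a similarity score is converted into a log10 likelihood ratio comparing the same-author and different-author hypotheses (all parameter values are hypothetical):

```python
import math

def log_likelihood_ratio(score, same_mean, same_std, diff_mean, diff_std):
    """Log10 likelihood ratio of a similarity score: how much more likely
    the score is under the same-author hypothesis than under the
    different-author hypothesis, using Gaussian score models calibrated
    on reference data (an assumption, not the paper's method)."""
    def gauss_logpdf(x, mu, sigma):
        return (-0.5 * math.log(2 * math.pi * sigma ** 2)
                - (x - mu) ** 2 / (2 * sigma ** 2))
    return (gauss_logpdf(score, same_mean, same_std)
            - gauss_logpdf(score, diff_mean, diff_std)) / math.log(10)

# Hypothetical calibration: same-author scores cluster near 0.8,
# different-author scores near 0.3, both with std 0.1.
llr = log_likelihood_ratio(0.75, 0.8, 0.1, 0.3, 0.1)
# LLR > 0 supports the same-author hypothesis; LLR < 0 supports
# different authorship. An impersonation evades the system only if
# it drags the LLR above the decision threshold.
```

In this framing, the paper's finding is that GPT-4o impersonations fail to shift the verification score enough to flip the likelihood ratio toward the same-author hypothesis.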