Beyond Detection: Governing GenAI in Academic Peer Review as a Sociotechnical Challenge

arXiv cs.AI / 3/24/2026

💬 OpinionSignals & Early TrendsIdeas & Deep AnalysisModels & Research

Key Points

  • The paper studies how generative AI is being discussed and experienced in academic peer review, combining social media discourse analysis (448 posts) with interviews of area and program chairs from major AI/HCI conferences.
  • It finds broad consensus that GenAI can be acceptable for limited supportive tasks (e.g., improving clarity and structuring feedback) but that core evaluative judgments—such as assessing novelty, contributions, and acceptance—should remain human responsibilities.
  • Participants raise sociotechnical risks including epistemic harm, over-standardization, unclear accountability, and adversarial threats like prompt injection.
  • The work argues that institutional strain and ambiguous policies shift enforcement burdens onto individual scholars, disproportionately impacting junior authors and reviewers.
  • It concludes that governing GenAI in peer review should rely on explicit, role-specific controls and enforceable boundaries for “support vs. evaluation,” rather than blanket bans or detection-only approaches.

Abstract

Generative AI tools are increasingly entering academic peer review workflows, raising questions about fairness, accountability, and the legitimacy of evaluative judgment. While these systems promise efficiency gains amid growing reviewer overload, their use introduces new sociotechnical risks. This paper presents a convergent mixed-method study combining discourse analysis of 448 social media posts with interviews with 14 area chairs and program chairs from leading AI and HCI conferences to examine how GenAI is discussed and experienced in peer review. Across both datasets, we find broad agreement that GenAI may be acceptable for limited supportive tasks, such as improving clarity or structuring feedback, but that core evaluative judgments, assessing novelty, contribution, and acceptance, should remain human responsibilities. At the same time, participants highlight concerns about epistemic harm, over-standardization, unclear responsibility, and adversarial risks such as prompt injection. User interviews reveal how structural strain and institutional policy ambiguity shift interpretive and enforcement burdens onto individual scholars, disproportionately affecting junior authors and reviewers. By triangulating public governance discourse with lived review practices, this work reframes AI mediated peer review as a sociotechnical governance challenge and offers recommendations for preserving accountability, trust, and meaningful human oversight. Overall, we argue that AI-assisted peer review is best governed not by blanket bans or detection alone, but by explicitly reserving evaluative judgment for humans while instituting enforceable, role-specific controls that preserve accountability. We conclude with role specific recommendations that formalize the support judgment boundary.