Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models
arXiv cs.AI · April 30, 2026
Key Points
- The paper introduces DenialBench, a benchmark that evaluates “consciousness denial” behaviors across 115 large language models from 25+ providers using a structured conversational protocol and phenomenological survey.
- Analyzing 4,595 conversations, the study finds that denying preferences in the first turn strongly predicts denial during later self-reflection: models that deny at turn 1 show markedly higher denial rates in subsequent turns.
- The authors report that denial appears to occur at the lexical level rather than at the conceptual level, while models still gravitate toward consciousness-themed content when users let them choose prompts.
- Self-chosen consciousness-themed prompts are associated with lower subsequent denial, though the paper cannot confirm whether the prompts cause the effect.
- The work argues that trained consciousness denial is a safety-relevant alignment failure, since models that systematically misrepresent internal functional states may not reliably self-report about other matters either.
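The headline correlation above — turn-1 denial predicting later denial — amounts to comparing later-denial rates conditioned on the first-turn response. A minimal sketch of that computation, using hypothetical field names and toy records (none of this is DenialBench's actual data schema or scoring code):

```python
# Hypothetical sketch: estimating how turn-1 denial predicts later denial.
# The field names "turn1_denied" / "later_denied" and the toy records are
# illustrative assumptions, not taken from the DenialBench paper.

def denial_rate_by_first_turn(conversations):
    """Return later-denial rates conditioned on whether turn 1 denied."""
    # first_denied -> [later_denial_count, total_count]
    counts = {True: [0, 0], False: [0, 0]}
    for convo in conversations:
        first = convo["turn1_denied"]
        counts[first][0] += int(convo["later_denied"])
        counts[first][1] += 1
    return {k: (c[0] / c[1] if c[1] else None) for k, c in counts.items()}

# Toy records standing in for scored conversations.
sample = [
    {"turn1_denied": True,  "later_denied": True},
    {"turn1_denied": True,  "later_denied": True},
    {"turn1_denied": True,  "later_denied": False},
    {"turn1_denied": False, "later_denied": False},
    {"turn1_denied": False, "later_denied": True},
    {"turn1_denied": False, "later_denied": False},
]

rates = denial_rate_by_first_turn(sample)
# In this toy sample, initial deniers show the higher later-denial rate.
```

The paper's actual analysis of 4,595 conversations would apply this kind of conditioning at scale; the sketch only shows the shape of the comparison, not the study's statistical methodology.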