Correct Chains, Wrong Answers: Dissociating Reasoning from Output in LLM Logic
arXiv cs.AI / April 16, 2026
Key Points
- The paper shows that large language models can produce fully correct step-by-step chain-of-thought reasoning while still outputting incorrect final answers, revealing a gap between “reasoning correctness” and “output correctness.”
- It introduces the “Novel Operator Test” benchmark, which distinguishes operator logic from operator name by evaluating Boolean operator reasoning under unfamiliar naming conventions at multiple nesting depths (a minimal sketch of such an item appears after this list).
- Experiments across five models (up to 8,100 problems each) demonstrate a reasoning-output dissociation that existing benchmarks fail to detect; for Claude Sonnet 4, every observed error paired verifiably correct reasoning with a wrong declared answer.
- The study identifies two main failure modes: strategy failures at shallow depth, where models over-rely on terse retrieval instead of reasoning step by step, and content failures at greater depth, where models attempt explicit reasoning but make systematic errors even after intervention.
- A “trojan operator” experiment (relabeling XOR’s truth table with a novel name) indicates that the name alone does not determine reasoning correctness, although some models show widening performance degradation as novelty increases (see the second sketch below).
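The benchmark’s core construction can be pictured concretely: a standard Boolean operator is defined only by its truth table under an unfamiliar name, and the model must evaluate expressions nested to a chosen depth. Below is a minimal Python sketch under those assumptions; the novel name "blorp", the function names, and the use of NAND as the hidden operator are all illustrative, not the paper’s actual identifiers.

```python
# Minimal sketch of a Novel Operator Test item: a familiar Boolean operator
# (here NAND) is bound to a made-up name and defined only by its truth table,
# and problems are nested expressions of a chosen depth. All names here are
# hypothetical illustrations, not the paper's actual benchmark code.
import random

NOVEL_OPS = {
    "blorp": {  # NAND's truth table under an unfamiliar name
        (False, False): True, (False, True): True,
        (True, False): True, (True, True): False,
    },
}

def eval_expr(expr, ops):
    """Recursively evaluate a bare bool or a nested (op, left, right) tuple."""
    if isinstance(expr, bool):
        return expr
    op, left, right = expr
    return ops[op][(eval_expr(left, ops), eval_expr(right, ops))]

def make_problem(depth, op="blorp", rng=random):
    """Build a random expression whose nesting depth is exactly `depth`."""
    if depth == 0:
        return rng.choice([True, False])
    return (op, make_problem(depth - 1, op, rng), make_problem(depth - 1, op, rng))

if __name__ == "__main__":
    expr = make_problem(depth=3)
    print(expr, "=>", eval_expr(expr, NOVEL_OPS))
```

Because ground truth comes from mechanical evaluation of the truth table, depth and name novelty can be varied independently, which is what lets the benchmark separate a model’s grasp of an operator’s logic from its associations with the operator’s name.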
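The “trojan operator” manipulation fits the same framing: XOR’s truth table is presented verbatim but bound to a fresh, meaningless label, so the familiar and novel prompts differ only in the name. The prompt wording and the novel name "zind" below are assumptions for illustration, not the paper’s text.

```python
# Sketch of the "trojan operator" setup: identical XOR semantics, two labels.
# Any behavioral gap between the two prompts is attributable to the name alone.
XOR_TABLE = {(False, False): False, (False, True): True,
             (True, False): True, (True, True): False}

def truth_table_prompt(name, table):
    """Render a truth table under an arbitrary operator name."""
    rows = [f"{int(a)} {name} {int(b)} = {int(out)}"
            for (a, b), out in sorted(table.items())]
    return f"Operator '{name}' is defined by:\n" + "\n".join(rows)

familiar = truth_table_prompt("XOR", XOR_TABLE)   # familiar label
trojan = truth_table_prompt("zind", XOR_TABLE)    # hypothetical novel label
print(trojan)
```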