When Chain-of-Thought Fails, the Solution Hides in the Hidden States

arXiv cs.CL / 4/28/2026


Key Points

  • The study examines whether chain-of-thought (CoT) intermediate tokens are computationally useful by testing if token-level hidden states contain task-relevant information.
  • Using mechanistic causal analysis with activation patching on GSM8K, the researchers transfer hidden states from a CoT run into a direct-answer run and find that patched generation can significantly outperform both direct prompting and the original (possibly incorrect) CoT trace (a sketch of the patching step follows this list).
  • Task-relevant information in CoT appears more often in correct than incorrect runs, is unevenly distributed across tokens, and concentrates in mid-to-late transformer layers, often appearing earlier in the reasoning trace.
  • The paper finds that linguistic tokens (e.g., verbs and entities) are more likely to steer reasoning toward correctness, while mathematical tokens tend to encode answer-proximal details that are less effective for recovery.
  • Patched outputs are frequently shorter than full CoT chains yet achieve higher accuracy, implying that complete step-by-step reasoning traces may not always be required to solve the problem.
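
A minimal sketch of that patching step, assuming a LLaMA-style decoder served through Hugging Face transformers. The model name, prompts, layer index, and token positions below are illustrative placeholders, not values from the paper:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder model, not the paper's
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

layer_idx = 20              # a mid-to-late decoder layer (placeholder)
src_pos, tgt_pos = 57, 12   # token positions in the CoT / direct runs (placeholders)

# 1) Run the CoT prompt once and record one token's hidden state at one layer.
cot_ids = tok("Q: ... Let's think step by step. ...", return_tensors="pt").input_ids
with torch.no_grad():
    cot_out = model(cot_ids, output_hidden_states=True)
# hidden_states[layer_idx + 1] is the output of decoder layer `layer_idx`
donor = cot_out.hidden_states[layer_idx + 1][0, src_pos].clone()

# 2) Re-run the direct-answer prompt, overwriting that hidden state in place.
def patch_hook(module, inputs, output):
    hs = output[0] if isinstance(output, tuple) else output
    if hs.shape[1] > tgt_pos:  # patch only on the prefill pass, not cached steps
        hs[0, tgt_pos] = donor.to(hs.dtype)  # in-place edit; no return needed

handle = model.model.layers[layer_idx].register_forward_hook(patch_hook)
direct_ids = tok("Q: ... Answer:", return_tensors="pt").input_ids
patched = model.generate(direct_ids, max_new_tokens=16)
handle.remove()
print(tok.decode(patched[0], skip_special_tokens=True))
```

Note that the `model.model.layers` path assumes a LLaMA-family architecture; other model classes expose their decoder blocks under different attribute names.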

Abstract

Whether intermediate reasoning is computationally useful or merely explanatory depends on whether chain-of-thought (CoT) tokens contain task-relevant information. We present a mechanistic causal analysis of CoT on GSM8K using activation patching: transferring token-level hidden states from a CoT generation to a direct-answer run for the same question, then measuring the effect on final-answer accuracy. Across models, generating after patching yields substantially higher accuracy than both direct-answer prompting and the original CoT trace, revealing that individual CoT tokens can encode sufficient information to recover the correct answer, even when the original trace is incorrect. This task-relevant information is more prevalent in correct than incorrect CoT runs and is unevenly distributed across tokens, concentrating in mid-to-late layers and appearing earlier in the reasoning trace. Moreover, patched language tokens such as verbs and entities carry task-solving information that steers generation toward correct reasoning, whereas mathematical tokens encode answer-proximal content that rarely recovers the correct answer. Patched outputs are often shorter than full CoT traces and yet exceed their accuracy, suggesting that complete reasoning chains are not always necessary. Together, these findings demonstrate that CoT encodes recoverable, token-level problem-solving information, offering new insight into how reasoning is represented and where it breaks down.
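
The claim that task-relevant information concentrates in particular layers and trace positions implies a sweep over (layer, token-position) cells, scoring patched accuracy at each. Below is a self-contained, hypothetical skeleton of such a sweep; the `run_patched` callable stands in for the hook-based procedure above, and the toy oracle in the demo is invented purely to make the example runnable:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Problem:
    question: str
    gold: str

def localization_sweep(
    run_patched: Callable[[Problem, int, int], str],
    problems: List[Problem],
    layers: range,
    positions: range,
) -> Dict[Tuple[int, int], float]:
    """Patched-answer accuracy for each (layer, token-position) cell."""
    grid: Dict[Tuple[int, int], float] = {}
    for layer in layers:
        for pos in positions:
            hits = sum(
                run_patched(p, layer, pos).strip() == p.gold for p in problems
            )
            grid[(layer, pos)] = hits / len(problems)
    return grid

# Toy demo with a fake model in which the recoverable information "lives"
# in layers >= 16 and early trace positions (an invented pattern).
if __name__ == "__main__":
    probs = [Problem("2 + 2 = ?", "4"), Problem("3 * 3 = ?", "9")]
    fake = lambda p, layer, pos: p.gold if layer >= 16 and pos < 8 else "?"
    grid = localization_sweep(fake, probs, range(0, 32, 8), range(0, 16, 8))
    best = max(grid, key=grid.get)
    print(f"best cell: layer={best[0]}, pos={best[1]}, acc={grid[best]:.2f}")
```

In the paper's terms, cells with high patched accuracy in mid-to-late layers and early trace positions would correspond to where CoT's recoverable problem-solving information resides.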