The Detection–Extraction Gap: Models Know the Answer Before They Can Say It
arXiv cs.CL / 4/9/2026
Key Points
- The paper finds a “detection–extraction gap,” where reasoning models generate substantial chain-of-thought after the correct answer is already recoverable from an early prefix (52–88% of CoT tokens are produced post-commitment).
- It shows that free continuation decoding can recover the correct answer from as little as 10% of the trace, while forced extraction fails in 42% of those cases, implying the model state contains the answer but decoding choices prevent retrieval.
- The authors formalize the mismatch by bounding the total-variation distance between free vs. forced continuation distributions, quantifying how the suffix induces a shift.
- To address the gap, the paper proposes Black-box Adaptive Early Exit (BAEE), using free continuations for both detection and extraction to truncate 70–78% of serial generation and improve accuracy by 1–5 percentage points across tested models and benchmarks.
- For “thinking-mode” models, early exit avoids post-commitment overwriting, with gains of up to 5.8pp; a cost-optimized variant cuts API calls by 68–73%, to a median of nine calls. Code is released on GitHub.
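The adaptive early-exit idea in the last two points can be sketched as a probe-and-agree loop. This is a minimal illustration, not the paper's actual procedure: the `free_continue` stub, its toy commitment rule, and the consecutive-agreement exit criterion are all assumptions introduced here to make the sketch runnable.

```python
def free_continue(prefix: str) -> str:
    """Placeholder for a black-box API call that lets the model continue
    freely from `prefix` and returns the answer it settles on. Stubbed with
    a toy rule: the model 'commits' once three reasoning steps are present."""
    return "42" if prefix.count("step") >= 3 else "unsure"

def baee_early_exit(steps, agree_k=2):
    """Probe with a free continuation at each checkpoint; exit as soon as
    the same committed answer is extracted `agree_k` times in a row."""
    prefix, history = "", []
    for i, step in enumerate(steps, 1):
        prefix += step
        answer = free_continue(prefix)  # one probe serves detection and extraction
        history.append(answer)
        stable = len(history) >= agree_k and len(set(history[-agree_k:])) == 1
        if stable and answer != "unsure":
            return answer, i            # early exit: remaining CoT is truncated
    return history[-1], len(steps)      # fallback: full trace consumed

answer, checkpoints_used = baee_early_exit([f"step {j}. " for j in range(10)])
```

With the toy commitment rule above, the loop exits after four of ten checkpoints, mirroring the paper's claim that most post-commitment generation can be skipped; a real deployment would replace `free_continue` with an actual API call and tune `agree_k` against the cost–accuracy trade-off.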