Decoding the Critique Mechanism in Large Reasoning Models
arXiv cs.LG / 3/18/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- Large Reasoning Models exhibit backtracking and self-verification, and the paper argues that strong critique ability is needed to detect errors and trigger self-correction.
- By deliberately inserting arithmetic mistakes into intermediate reasoning steps, the study shows that models can still reach the correct final answer, revealing a hidden internal critique mechanism.
- The authors identify a highly interpretable 'critique vector' in latent space and demonstrate that steering representations along this vector improves error detection without additional training.
- Experiments across multiple model scales and families suggest the critique mechanism is robust and can be exploited to improve self-verification and test-time scaling.
- The authors provide code at https://github.com/mail-research/lrm-critique-vectors to reproduce and extend their results.
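The core idea, as summarized above, is that a single direction in latent space separates activations on erroneous reasoning steps from activations on correct ones, and that shifting hidden states along that direction strengthens error detection without retraining. A minimal sketch of that recipe, assuming a simple difference-of-means construction (the function names and the choice of a mean-difference estimator are illustrative assumptions, not the paper's exact method):

```python
import numpy as np

def critique_vector(h_err: np.ndarray, h_ok: np.ndarray) -> np.ndarray:
    """Estimate a unit 'critique' direction as the difference of mean
    activations on erroneous vs. correct reasoning steps.

    h_err, h_ok: arrays of shape (n_samples, hidden_dim), e.g. hidden
    states collected at a chosen layer. This mean-difference estimator
    is an assumption for illustration.
    """
    v = h_err.mean(axis=0) - h_ok.mean(axis=0)
    return v / np.linalg.norm(v)

def steer(hidden: np.ndarray, v: np.ndarray, alpha: float = 2.0) -> np.ndarray:
    """Shift hidden states along the critique direction at inference time.

    alpha controls steering strength; in practice this would be applied
    inside the model (e.g. via a forward hook) rather than post hoc.
    """
    return hidden + alpha * v

# Synthetic demo: erroneous-step activations are offset along axis 0.
rng = np.random.default_rng(0)
h_ok = rng.normal(size=(100, 8))
h_err = rng.normal(size=(100, 8))
h_err[:, 0] += 3.0  # planted 'error' direction

v = critique_vector(h_err, h_ok)
steered = steer(h_ok, v, alpha=2.0)
```

On this synthetic data the recovered direction aligns closely with the planted one, which is the sanity check one would run before steering a real model's residual stream.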