CoT2-Meta: Budgeted Metacognitive Control for Test-Time Reasoning
arXiv cs.AI / 3/31/2026
Key Points
- CoT2-Meta is a training-free test-time reasoning framework that adds metacognitive control to object-level chain-of-thought generation.
- It uses a meta-controller to decide when to expand, prune, repair, stop, or fall back, guided by strategy-conditioned generation and tree-structured search.
- An online process oracle evaluates step-level reasoning trajectories, enabling more targeted computation allocation under fixed inference budgets.
- Across standard benchmarks (e.g., MATH, GPQA, GSM8K, BBEH, MMMU-Pro, HLE), CoT2-Meta outperforms strong baselines including ReST-MCTS, with reported gains ranging from about +1.15 to +5.2 points on key tasks.
- The paper also reports improved compute scaling, calibration/selective prediction, and consistent effectiveness across a broader set of 15 benchmarks and multiple backbone families.
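
To make the control loop in the points above concrete, here is a minimal sketch of a budgeted search in which a controller chooses between expand, prune, repair, stop, and fallback based on a step-level score. It is not the paper's implementation. The function names (`meta_controlled_search`, `expand`, `repair`, `oracle`, `is_final`), the threshold rule, the threshold values, and the choice to count the budget in generator calls are all assumptions made for illustration, and the toy generator and oracle stand in for a language model and a learned or prompted process evaluator.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

Path = Tuple[str, ...]  # a partial reasoning trajectory, one string per step


@dataclass
class Result:
    answer: Optional[Path]  # accepted or best finished trajectory, else None
    actions: List[str]      # controller decisions, in order
    calls: int              # generator calls spent (expand + repair)


def meta_controlled_search(
    expand: Callable[[Path], List[Path]],  # hypothetical: propose next steps
    repair: Callable[[Path], Path],        # hypothetical: rewrite the last step
    oracle: Callable[[Path], float],       # hypothetical: process score in [0, 1]
    is_final: Callable[[Path], bool],
    budget: int,
    prune_below: float = 0.2,   # assumed thresholds, not taken from the paper
    repair_below: float = 0.5,
    stop_at: float = 0.9,
) -> Result:
    def value(p: Path) -> float:
        return oracle(p) if p else 1.0  # the empty root is always worth expanding

    frontier: List[Path] = [()]
    repaired: Set[Path] = set()
    best_final: Optional[Path] = None
    actions: List[str] = []
    calls = 0

    while frontier and calls < budget:
        frontier.sort(key=value)   # a real system would cache oracle scores
        path = frontier.pop()      # most promising trajectory
        score = value(path)
        if score < prune_below:
            actions.append("prune")            # drop a hopeless branch
        elif score < repair_below and path not in repaired:
            calls += 1
            fixed = repair(path)
            repaired.add(fixed)                # a repaired path is not repaired again
            frontier.append(fixed)
            actions.append("repair")
        elif is_final(path):
            if score >= stop_at:
                actions.append("stop")         # confident enough to answer early
                return Result(path, actions, calls)
            if best_final is None or score > value(best_final):
                best_final = path              # keep as a low-confidence candidate
        else:
            calls += 1
            frontier.extend(expand(path))
            actions.append("expand")

    actions.append("fallback")     # budget or frontier exhausted
    return Result(best_final, actions, calls)


# Toy problem: step "a" is sound, "b" is fixable, "c" is hopeless.
STEP_SCORE = {"a": 0.95, "b": 0.4, "c": 0.1}

def toy_expand(p: Path) -> List[Path]:
    return [p + ("b",), p + ("c",)]

def toy_repair(p: Path) -> Path:
    return p[:-1] + ("a",)

def toy_oracle(p: Path) -> float:
    return min(STEP_SCORE[s] for s in p)  # a path is as good as its weakest step

def toy_is_final(p: Path) -> bool:
    return len(p) == 2

result = meta_controlled_search(toy_expand, toy_repair, toy_oracle, toy_is_final, budget=10)
print(result.actions, result.answer, result.calls)
```

On the toy problem the controller spends four generator calls, two on expansion and two on repair, before stopping on a trajectory whose score clears `stop_at`. With `budget=2` the same search ends in a fallback instead, which is the trade-off a fixed inference budget forces: the controller's choices determine which branches receive the limited calls.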



