Select-then-Solve: Paradigm Routing as Inference-Time Optimization for LLM Agents
arXiv cs.CL / 4/9/2026
Key Points
- The study compares six inference-time reasoning paradigms for LLM agents (Direct, CoT, ReAct, Plan-Execute, Reflection, ReCode) across four frontier models and ten benchmarks, finding that some paradigms improve performance on certain tasks while others significantly degrade it.
- Results show no universally best reasoning paradigm (e.g., ReAct improves GAIA by 44pp over Direct, while CoT drops HumanEval by 15pp), highlighting strong task-dependent complementarity.
- An “oracle per-task selection” approach achieves an average improvement of 17.1pp over the best single fixed paradigm, indicating that choosing the right paradigm per task is crucial.
- The paper proposes “select-then-solve,” where a lightweight embedding-based router selects the best paradigm for each task; across four models it raises average accuracy from 47.6% to 53.1% and recovers up to 37% of the oracle gap, outperforming the best fixed paradigm by 2.8pp.
- The authors find that zero-shot self-routing is unreliable: only GPT-5 benefits from it (67.1%), while weaker models actually get worse, strengthening the case for a learned router for per-task paradigm selection.
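The "lightweight embedding-based router" described above can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's implementation: it uses bag-of-words vectors in place of a learned embedding model, and routes each incoming task to the paradigm that performed best on the most similar previously seen task.

```python
from collections import Counter
import math

# The six paradigms compared in the paper.
PARADIGMS = ["Direct", "CoT", "ReAct", "Plan-Execute", "Reflection", "ReCode"]

def embed(text):
    # Toy bag-of-words "embedding". The paper's router presumably uses a
    # learned text-embedding model; this stand-in is purely illustrative.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ParadigmRouter:
    """Nearest-neighbor router: for a new task, pick the paradigm that
    won on the most similar task observed during offline evaluation."""

    def __init__(self):
        self.examples = []  # list of (embedding, best_paradigm)

    def fit(self, labeled_tasks):
        # labeled_tasks: iterable of (task_text, best_paradigm) pairs,
        # e.g. collected by running all paradigms on benchmark tasks.
        self.examples = [(embed(t), p) for t, p in labeled_tasks]

    def route(self, task_text):
        q = embed(task_text)
        return max(self.examples, key=lambda ex: cosine(q, ex[0]))[1]
```

Usage under the same hypothetical setup: after `fit` on tasks labeled with their best-performing paradigm, `route("write a python function to sort a list")` would return the paradigm attached to the nearest coding task, so the agent "selects" before it "solves".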