Adaptive Multi-Expert Reasoning via Difficulty-Aware Routing and Uncertainty-Guided Aggregation

arXiv cs.CL / 4/14/2026


Key Points

  • The paper proposes Adaptive Multi-Expert Reasoning (AMR), which routes math problems to dynamically chosen strategies based on predicted difficulty and uncertainty.
  • AMR uses an “agile routing” component plus a reconfigurable sampling mechanism to control generation breadth, then produces candidate solutions via multiple specialized experts.
  • It refines candidates through iterative correction/finalization phases and uses a neural verifier to assess correctness.
  • A clustering-based aggregation step selects the final answer using both consensus across candidates and answer quality.
  • On GSM8K, AMR reaches 75.28% accuracy using only original training data, outperforming many comparable 7B models trained on synthetic data, indicating improved robustness through difficulty-aware routing.
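As a rough illustration of how predicted difficulty and uncertainty could control generation breadth, the sketch below maps two scores to a per-expert sampling budget. Everything in it is an assumption made for illustration: the function name `choose_num_samples`, the [0, 1] score range, the use of the larger of the two scores, and the budget limits are not taken from the paper, which does not specify its routing rule here.

```python
def choose_num_samples(difficulty, uncertainty, min_samples=1, max_samples=8):
    """Map predicted difficulty and uncertainty to a sampling budget.

    Hypothetical rule: both scores are assumed to lie in [0, 1], and the
    larger of the two decides how many candidates to sample per expert.
    """
    if not (0.0 <= difficulty <= 1.0 and 0.0 <= uncertainty <= 1.0):
        raise ValueError("difficulty and uncertainty must lie in [0, 1]")
    if min_samples < 1 or max_samples < min_samples:
        raise ValueError("require 1 <= min_samples <= max_samples")

    # Either a hard problem or an unsure router widens the search.
    score = max(difficulty, uncertainty)

    # Linearly interpolate between the smallest and largest budget.
    return min_samples + round(score * (max_samples - min_samples))


if __name__ == "__main__":
    # An easy, confidently routed problem gets a narrow budget;
    # a hard or uncertain one gets a wider budget.
    print(choose_num_samples(0.1, 0.1))
    print(choose_num_samples(0.2, 0.9))
```

Taking the maximum of the two scores is one simple design choice: it widens sampling whenever either signal is high. A learned router could replace this hand-written rule.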

Abstract

Large language models (LLMs) perform strongly on math reasoning benchmarks, but their accuracy is inconsistent across problems of differing difficulty. This paper describes Adaptive Multi-Expert Reasoning (AMR), a framework that adapts its reasoning strategy to problem complexity. An agile routing system predicts each problem's difficulty and uncertainty from the problem text and uses these predictions to guide a reconfigurable sampling mechanism that controls the breadth of generation. Three specialized experts generate candidate responses, which are refined over multiple correction and finalization phases. A neural verifier assesses the correctness of each response, and a clustering-based aggregation technique selects the final answer from a combination of consensus and answer quality. Evaluated on the GSM8K dataset, AMR achieved 75.28% accuracy using only the original training data, outperforming the majority of comparable 7B models trained on synthetic data. These results suggest that difficulty-based routing and uncertainty-driven aggregation are an efficient and effective way to improve the robustness of math reasoning models.
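The aggregation step, which combines consensus with answer quality, can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method: candidates are clustered by exact match on the whitespace-stripped final answer, quality is the mean verifier score within a cluster, and the two signals are mixed by a hypothetical `consensus_weight` parameter. The function name `aggregate` is likewise illustrative.

```python
from collections import defaultdict


def aggregate(candidates, consensus_weight=0.5):
    """Pick a final answer from (answer, verifier_score) pairs.

    Assumptions for illustration: clusters are formed by exact match on
    the stripped answer string, and verifier scores lie in [0, 1].
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    # Group verifier scores by normalized final answer.
    clusters = defaultdict(list)
    for answer, verifier_score in candidates:
        clusters[answer.strip()].append(verifier_score)

    total = len(candidates)

    def cluster_score(scores):
        consensus = len(scores) / total      # share of candidates agreeing
        quality = sum(scores) / len(scores)  # mean verifier score
        return consensus_weight * consensus + (1 - consensus_weight) * quality

    # Return the answer whose cluster has the highest combined score.
    return max(clusters, key=lambda answer: cluster_score(clusters[answer]))


if __name__ == "__main__":
    samples = [("18", 0.9), ("18", 0.8), ("20", 0.95), (" 18 ", 0.7)]
    print(aggregate(samples))
```

With `consensus_weight=1.0` this reduces to plain majority voting, and with `consensus_weight=0.0` it selects the cluster the verifier rates highest on average, so the parameter exposes the trade-off between the two signals.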