Optimized Deferral for Imbalanced Settings

arXiv cs.LG / 5/1/2026


Key Points

  • The paper studies “learning to defer” methods, which send uncertain or complex inputs to specialized experts to improve accuracy while controlling computational cost.
  • It identifies a key limitation of two-stage learning-to-defer setups: an expert imbalance problem in which the deferral mechanism can over-prefer the majority expert.
  • The authors reformulate deferral loss optimization as a cost-sensitive learning problem over the input–expert domain and propose new margin-based loss functions with setting-specific guarantees.
  • They introduce MILD (Margin-based Imbalanced Learning to Defer), a principled algorithm designed specifically for imbalanced experts.
  • Experiments on image classification and real-world LLM routing tasks show MILD delivers clear improvements over prior baselines.
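The core idea behind the reformulation above can be illustrated with a toy deferral rule: treat each expert as an option with an estimated chance of being correct and a query cost, and pick the option minimizing expected error plus a cost penalty. All numbers and names here are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of deferral as cost-sensitive selection over experts.
# p_correct and cost are assumed inputs (e.g., from a held-out calibration set),
# not quantities defined by the paper.

def defer(p_correct, cost, lam=0.1):
    """Return the index of the expert minimizing expected error + lam * cost."""
    scores = [(1.0 - p) + lam * c for p, c in zip(p_correct, cost)]
    return min(range(len(scores)), key=scores.__getitem__)

# Expert 0: cheap base model; experts 1-2: costlier specialists.
p_correct = [0.70, 0.90, 0.95]   # estimated chance each expert is right on this input
cost      = [0.0, 1.0, 3.0]      # relative query costs
print(defer(p_correct, cost))
```

With these numbers the mid-cost specialist wins: its accuracy gain over the base model outweighs its cost penalty, while the most accurate expert is too expensive at this trade-off weight.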

Abstract

Learning algorithms can be significantly improved by routing complex or uncertain inputs to specialized experts, balancing accuracy with computational cost. This approach, known as learning to defer, is essential in domains like natural language generation, medical diagnosis, and computer vision, where effective deferral can reduce errors at low additional resource cost. However, the two-stage learning to defer setting, which leverages existing predictors such as a collection of LLMs or other classifiers, often faces challenges due to an expert imbalance problem. This imbalance can lead to suboptimal performance, with deferral algorithms favoring the majority expert. We present a comprehensive study of two-stage learning to defer in expert imbalance settings. We cast the deferral loss optimization as a novel cost-sensitive learning problem over the input–expert domain. We derive new margin-based loss functions and guarantees tailored to this setting, and develop novel algorithms for cost-sensitive learning. Leveraging these results, we design a principled deferral algorithm, MILD (Margin-based Imbalanced Learning to Defer), specifically suited for expert imbalance settings. Extensive experiments demonstrate the effectiveness of our approach, showing clear improvements over existing baselines on both image classification and real-world Large Language Model (LLM) routing tasks.
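One common way a margin-based loss can be adapted to imbalance, which may convey the flavor of the approach, is to reweight a multiclass hinge loss over expert scores by inverse expert frequency, so the majority expert cannot dominate the objective. This is an illustrative stand-in under assumed inputs, not the paper's actual MILD loss.

```python
def weighted_margin_loss(scores, target, counts, margin=1.0):
    """Multiclass hinge loss on expert scores, reweighted by inverse expert
    frequency. `scores[j]` is the model's score for routing to expert j,
    `target` is the expert we want chosen, and `counts[j]` is how often
    expert j appears in training (the source of the imbalance).
    Illustrative only; not the loss defined in the paper."""
    # Inverse-frequency weight: rare target experts get up-weighted.
    w = sum(counts) / (len(counts) * counts[target])
    # Standard multiclass margin: penalize if any other expert's score
    # comes within `margin` of the target expert's score.
    runner_up = max(s for j, s in enumerate(scores) if j != target)
    return w * max(margin - scores[target] + runner_up, 0.0)
```

If the target expert already wins by the full margin, the loss is zero regardless of the weight; when a rare expert is the target and loses, the inverse-frequency factor amplifies the penalty.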