Fatigue-Aware Learning to Defer via Constrained Optimisation

arXiv cs.LG / 4/2/2026


Key Points

  • The paper introduces FALCON, a fatigue-aware learning-to-defer (L2D) method that accounts for workload-dependent human performance degradation rather than assuming static expert accuracy.
  • FALCON formulates L2D as a constrained Markov decision process (CMDP) with a state that includes task features plus cumulative human workload, then optimizes accuracy subject to cooperation/coverage budgets using PPO-Lagrangian training.
  • The authors propose FA-L2D, a benchmark that varies fatigue dynamics from near-static to rapidly degrading regimes to test robustness under different human fatigue patterns.
  • Experiments on multiple datasets indicate that FALCON outperforms existing L2D approaches across coverage levels and generalizes zero-shot to unseen experts with different fatigue behaviors. They also show that adaptive human-AI collaboration beats both AI-only and human-only decision-making when coverage lies strictly between 0 and 1.
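
The ingredients named in the key points (a workload-dependent expert, a state augmented with cumulative workload, and a Lagrangian penalty on the cooperation budget) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the exponential shape of the fatigue curve, the parameter values, the function names, and the choice to express the budget as a cap on the deferral rate.

```python
import math


def human_accuracy(workload, base=0.9, floor=0.6, rate=0.05):
    """Hypothetical fatigue curve: accuracy decays exponentially from
    `base` toward `floor` as cumulative workload grows."""
    return floor + (base - floor) * math.exp(-rate * workload)


def make_state(features, workload):
    """State = task features plus cumulative human workload."""
    return list(features) + [float(workload)]


def penalised_reward(correct, deferred, lam):
    """Lagrangian reward: 1 for a correct decision, minus lam per deferral."""
    return float(correct) - lam * float(deferred)


def update_multiplier(lam, deferral_rate, budget, lr=0.1):
    """Dual ascent: raise lam when deferrals exceed the budget, lower it
    otherwise, and keep it non-negative."""
    return max(0.0, lam + lr * (deferral_rate - budget))
```

In a PPO-Lagrangian style loop, a policy would typically be trained on `penalised_reward` while `update_multiplier` is called between policy updates, so the penalty grows only while the budget is being violated.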

Abstract

Learning to defer (L2D) enables human-AI cooperation by deciding when an AI system should act autonomously or defer to a human expert. Existing L2D methods, however, assume static human performance, contradicting well-established findings on fatigue-induced degradation. We propose Fatigue-Aware Learning to Defer via Constrained Optimisation (FALCON), which explicitly models workload-varying human performance using psychologically grounded fatigue curves. FALCON formulates L2D as a Constrained Markov Decision Process (CMDP) whose state includes both task features and cumulative human workload, and optimises accuracy under human-AI cooperation budgets via PPO-Lagrangian training. We further introduce FA-L2D, a benchmark that systematically varies fatigue dynamics from near-static to rapidly degrading regimes. Experiments across multiple datasets show that FALCON consistently outperforms state-of-the-art L2D methods across coverage levels, generalises zero-shot to unseen experts with different fatigue patterns, and demonstrates the advantage of adaptive human-AI collaboration over AI-only or human-only decision-making when coverage lies strictly between 0 and 1.
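
For readers unfamiliar with constrained MDPs, the setup described in the abstract can be written in the generic CMDP form below. This is a sketch under assumed notation, not the paper's own formulation: the symbols $x_t$ (task features), $w_t$ (cumulative human workload), $r$ (a reward such as decision correctness), $c$ (a cost such as a deferral indicator), $b$ (the cooperation budget), and $\lambda$ (the Lagrange multiplier) are all introduced here for illustration.

```latex
\begin{aligned}
s_t &= (x_t, w_t), \qquad a_t \in \{\text{AI}, \text{defer}\}, \\
\max_{\pi}\; & J_r(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t=1}^{T} r(s_t, a_t)\Big]
\quad \text{s.t.} \quad
J_c(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t=1}^{T} c(s_t, a_t)\Big] \le b, \\
\min_{\lambda \ge 0}\, \max_{\pi}\; & \mathcal{L}(\pi, \lambda) = J_r(\pi) - \lambda\,\big(J_c(\pi) - b\big).
\end{aligned}
```

Under this reading, PPO-Lagrangian training alternates PPO updates of $\pi$ on the penalised objective with gradient steps on $\lambda$, which increases whenever the expected cost exceeds $b$.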