An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources

arXiv cs.AI / 4/28/2026

Key Points

  • The paper studies when “joint training” (simultaneously training the job scheduling and automated guided vehicle (AGV) scheduling agents) is necessary versus “modular training” (training each agent independently and integrating them afterward) for job-shop scheduling with transportation resources.
  • It introduces and quantifies a “coordination gap,” defined as the performance difference between the two training modalities, through a sensitivity analysis of resource scarcity and temporal dominance.
  • Results show that joint training can outperform the best-performing combinations of dispatching rules and modular training, indicating a real benefit from tighter coordination.
  • The coordination gap shrinks in bottleneck environments, especially under severe transport and processing constraints, so modular training becomes a viable alternative when a single scheduling task dominates.
  • The study provides practical guidance for selecting a multi-agent reinforcement learning training strategy based on environmental conditions, with the aim of maximizing scheduling performance.

Abstract

Efficient job-shop scheduling with transportation resources is critical for high-performance manufacturing. With the rise of "decentralized factories", multi-agent reinforcement learning has emerged as a promising approach for the combined scheduling of production and transportation tasks. Prior work has largely focused on developing novel cooperative architectures while overlooking the question of when joint training is necessary. Joint training denotes the simultaneous training of job and automated guided vehicle (AGV) scheduling agents, whereas modular training involves training each agent independently, followed by post-hoc integration. In this study, we systematically investigate the conditions under which joint training is essential for optimal performance in the job-shop scheduling problem with transportation resources. Through a rigorous sensitivity analysis of resource scarcity and temporal dominance, we quantify the coordination gap, that is, the performance difference between these two training modalities. In our evaluation, joint training can produce superior performance compared to the best-performing combinations of dispatching rules and modular training. However, this advantage diminishes in bottleneck environments, particularly under severe transport and processing constraints. These findings indicate that modular training represents a viable alternative in environments where a single scheduling task dominates. Overall, our work provides practical guidance for selecting between training modalities based on environmental conditions, enabling decision-makers to optimize reinforcement learning-based scheduling performance.
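
The coordination gap described above is a performance difference between two training modalities, evaluated across environment settings. The abstract does not state the exact metric or normalization, so the following is only a minimal sketch under stated assumptions: performance is taken to be makespan (lower is better), the gap is normalized by the modular mean, and the function names, setting labels, and numbers are all hypothetical placeholders rather than anything from the paper.

```python
from statistics import mean


def coordination_gap(modular_makespans, joint_makespans):
    """Relative performance difference between modular and joint training.

    Assumption (not from the paper): performance is makespan, lower is
    better, and the gap is normalized by the modular mean. A positive value
    means joint training produced shorter schedules on average.
    """
    if not modular_makespans or not joint_makespans:
        raise ValueError("both makespan lists must be non-empty")
    modular_mean = mean(modular_makespans)
    joint_mean = mean(joint_makespans)
    return (modular_mean - joint_mean) / modular_mean


def gap_by_setting(results):
    """Compute the gap for every environment setting in a sensitivity sweep.

    `results` is a hypothetical dict of the form
    {setting_label: {"modular": [...], "joint": [...]}}.
    """
    return {
        setting: coordination_gap(runs["modular"], runs["joint"])
        for setting, runs in results.items()
    }


# Toy placeholder numbers for illustration only, NOT results from the paper.
toy_results = {
    "scarce_transport": {"modular": [100.0, 100.0], "joint": [98.0, 102.0]},
    "balanced": {"modular": [100.0, 100.0], "joint": [80.0, 80.0]},
}
gaps = gap_by_setting(toy_results)
```

Under this convention, a gap near zero for a setting would suggest that modular training is a reasonable choice there, while a clearly positive gap would favor joint training. A different metric (for example tardiness or reward) would only require changing what the input lists contain and, if higher is better, flipping the sign.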