Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs

arXiv cs.LG / 4/28/2026


Key Points

  • The paper presents a unified, feature-based “homogenization” graph framework for Job Shop Scheduling that avoids the scalability bottlenecks of prior RL models, which stem from heterogeneous architectural overhead or quadratic graph complexity.
  • By projecting different node roles into a shared latent space, it enables a standard homogeneous Graph Isomorphism Network to model resource contention with linear-complexity behavior for low-latency inference.
  • Experiments report state-of-the-art scheduling performance along with consistent zero-shot generalization across instance settings.
  • The authors find that the job-to-machine ratio (not absolute problem size) primarily determines policy effectiveness, and they propose a “structural saturation” hypothesis where training at critical congestion (J≈M) yields scale-invariant conflict-resolution strategies.
  • This saturation-trained approach is claimed to reduce the need for expensive scale-specific retraining and to mitigate overfitting to statistical shortcuts, supporting robust deployment in dynamic industrial production environments.
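The homogenization-plus-GIN mechanism in the points above can be sketched roughly as follows. This is a minimal NumPy illustration under assumptions of our own (feature dimensions, a single GIN-style layer, random weights), not the authors' implementation: role-specific projections map operation and machine nodes into one shared latent space, after which a standard homogeneous GIN update runs over the sparse edge list in time linear in the number of edges.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared latent dimension (assumed)

# Role-specific projections: heterogeneous node features
# (operations vs. machines) are mapped into one shared space.
W_op = rng.normal(size=(4, D))    # operation features (e.g. processing time, status)
W_mach = rng.normal(size=(2, D))  # machine features (e.g. load, availability)

ops = rng.normal(size=(6, 4))     # 6 operation nodes
machs = rng.normal(size=(3, 2))   # 3 machine nodes
h = np.vstack([ops @ W_op, machs @ W_mach])  # homogenized node matrix, shape (9, D)

# One GIN-style update over a sparse edge list:
#   h_v' = MLP((1 + eps) * h_v + sum_{u in N(v)} h_u)
# Cost is O(|E|), i.e. linear in edges rather than quadratic in nodes.
edges = [(0, 6), (1, 6), (2, 7), (3, 7), (4, 8), (5, 8), (0, 1), (1, 2)]
eps = 0.1
agg = np.zeros_like(h)
for u, v in edges:  # symmetric neighbor aggregation
    agg[v] += h[u]
    agg[u] += h[v]
W1 = rng.normal(size=(D, D))
h_new = np.maximum(0.0, ((1 + eps) * h + agg) @ W1)  # one-layer MLP with ReLU

print(h_new.shape)  # (9, 16)
```

Because every node lives in the same latent space, the aggregation loop needs no per-edge-type logic, which is exactly what lets a standard homogeneous GNN replace heterogeneous layers.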

Abstract

Efficiently solving the Job Shop Scheduling Problem in real-world industrial applications requires policies that are both computationally lean and topologically robust. While Reinforcement Learning has shown potential in automating dispatching rules, existing models often struggle with a scalability bottleneck caused by quadratic graph complexity or the architectural overhead of heterogeneous layers. We introduce a unified graph framework that employs feature-based homogenization to project distinct node roles into a shared latent space. This allows a standard homogeneous Graph Isomorphism Network to capture complex resource contention with linear complexity, ensuring low-latency inference for large-scale industrial applications. Our empirical results demonstrate that our framework achieves state-of-the-art performance while exhibiting consistent zero-shot generalization. We identify the job-to-machine ratio as the primary driver of policy effectiveness, rather than absolute problem size. Based on this, we propose a hypothesis of structural saturation, demonstrating that policies trained on critically congested instances (J ≈ M) learn scale-invariant resolution strategies. Agents trained at this saturation point internalize invariant conflict-resolution logic, allowing them to treat massive rectangular instances as a sequential concatenation of saturated sub-problems. This approach eliminates the need for expensive scale-specific retraining and prevents overfitting to statistical shortcuts, providing a robust and efficient pathway for deploying RL solutions in dynamic production environments.
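The decomposition view in the abstract can be made concrete with a small sketch. The splitting rule, function name, and chunk size below are illustrative assumptions, not the paper's procedure: the point is only that a large rectangular instance (J much larger than M) can be read as a sequence of sub-problems, each at or below the critical congestion point J ≈ M where the policy was trained.

```python
def saturated_chunks(num_jobs: int, num_machines: int):
    """Partition a rectangular J x M instance (J >= M) into job groups
    of at most M jobs each, so every group sits at or below the
    critical congestion point J ~= M (hypothetical splitting rule)."""
    chunks = []
    start = 0
    while start < num_jobs:
        end = min(start + num_machines, num_jobs)
        chunks.append(range(start, end))
        start = end
    return chunks

# A 100-job, 10-machine instance decomposes into ten saturated
# 10 x 10 sub-problems, handled sequentially by the trained policy.
print([len(c) for c in saturated_chunks(100, 10)])  # [10, 10, ..., 10]
```

Under this reading, a policy that has internalized conflict resolution at saturation never sees a regime it was not trained on, which is why scale-specific retraining would be unnecessary.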