Decentralized Task Scheduling in Distributed Systems: A Deep Reinforcement Learning Approach

arXiv cs.AI, March 27, 2026


Key Points

  • The paper addresses scalable task scheduling in heterogeneous distributed systems where workloads change dynamically and multiple QoS/SLA requirements must be balanced, highlighting limits of centralized methods and non-adaptive heuristics.
  • It proposes a decentralized multi-agent deep reinforcement learning framework (DRL-MADRL) formulated as a Dec-POMDP and uses a lightweight actor-critic design.
  • The approach is implemented using only NumPy (plus Matplotlib and SciPy), making both learning and scheduling deployable on resource-constrained edge devices without heavyweight ML frameworks.
  • Experiments using workload characteristics from the Google Cluster Trace on a 100-node setup show 15.6% faster average completion time, 15.2% improved energy efficiency, and higher SLA satisfaction (82.3% vs 75.5%), with statistical significance (p < 0.001).
  • The authors provide complete source code and experimental data on GitHub for reproducibility.
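The paper's framework-free, NumPy-only actor-critic design can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the observation size, action space (candidate target nodes), network shape, and update rule are not taken from the authors' code, which uses its own architecture and hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper): each agent
# observes a local state vector and picks one of its candidate target nodes.
OBS_DIM, N_ACTIONS, HIDDEN = 8, 4, 16

# Actor: two-layer network producing a softmax over actions.
W1 = rng.normal(0.0, 0.1, (OBS_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))
# Critic: linear state-value estimate.
w_v = np.zeros(OBS_DIM)

def policy(obs):
    """Hidden activations and softmax action probabilities for one agent."""
    h = np.tanh(obs @ W1)
    logits = h @ W2
    p = np.exp(logits - logits.max())
    return h, p / p.sum()

def step(obs, reward, next_obs, lr=0.01, gamma=0.99):
    """One actor-critic update from a single local transition."""
    global W1, W2, w_v
    h, probs = policy(obs)
    action = rng.choice(N_ACTIONS, p=probs)
    # TD(0) error serves as the advantage estimate.
    td = reward + gamma * (next_obs @ w_v) - obs @ w_v
    # Critic update (semi-gradient TD).
    w_v += lr * td * obs
    # Actor update: gradient of log-softmax wrt logits is (one_hot - probs).
    g_logits = -probs
    g_logits[action] += 1.0
    g_h = (W2 @ g_logits) * (1.0 - h**2)   # backprop through tanh
    W2 += lr * td * np.outer(h, g_logits)
    W1 += lr * td * np.outer(obs, g_h)
    return action

# Usage: train on random transitions standing in for local observations.
for _ in range(100):
    a = step(rng.normal(size=OBS_DIM), rng.normal(), rng.normal(size=OBS_DIM))
    assert 0 <= a < N_ACTIONS
```

In the decentralized setting each node would run an instance of this loop on its own partial observation, which is what makes the Dec-POMDP formulation (no global state, no central scheduler) fit the single-point-of-failure critique in the first key point.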

Abstract

Efficient task scheduling in large-scale distributed systems presents significant challenges due to dynamic workloads, heterogeneous resources, and competing quality-of-service requirements. Traditional centralized approaches face scalability limitations and single points of failure, while classical heuristics lack adaptability to changing conditions. This paper proposes a decentralized multi-agent deep reinforcement learning (DRL-MADRL) framework for task scheduling in heterogeneous distributed systems. We formulate the problem as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) and develop a lightweight actor-critic architecture implemented using only NumPy, enabling deployment on resource-constrained edge devices without heavyweight machine learning frameworks. Using workload characteristics derived from the publicly available Google Cluster Trace dataset, we evaluate our approach on a 100-node heterogeneous system processing 1,000 tasks per episode over 30 experimental runs. Experimental results demonstrate 15.6% improvement in average task completion time (30.8s vs 36.5s for random baseline), 15.2% energy efficiency gain (745.2 kWh vs 878.3 kWh), and 82.3% SLA satisfaction compared to 75.5% for baselines, with all improvements statistically significant (p < 0.001). The lightweight implementation requires only NumPy, Matplotlib, and SciPy. Complete source code and experimental data are provided for full reproducibility at https://github.com/danielbenniah/marl-distributed-scheduling.
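The headline percentages are internally consistent with the raw figures quoted in the abstract, which is worth checking when a summary reports both. A quick verification using only those quoted numbers:

```python
# Relative improvement of DRL-MADRL over the baseline, computed from the
# raw figures quoted in the abstract (lower is better for both metrics).
def rel_improvement(baseline: float, ours: float) -> float:
    return 100.0 * (baseline - ours) / baseline

# Average task completion time: 30.8 s vs 36.5 s baseline.
print(round(rel_improvement(36.5, 30.8), 1))    # → 15.6 (%)
# Energy consumption: 745.2 kWh vs 878.3 kWh baseline.
print(round(rel_improvement(878.3, 745.2), 1))  # → 15.2 (%)
```

Both results match the reported 15.6% and 15.2% gains, confirming the improvements are stated relative to the baseline values.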
