CoordLight: Learning Decentralized Coordination for Network-Wide Traffic Signal Control

arXiv cs.LG / 3/26/2026

Key Points

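  • The paper targets the challenges of partial observability and coordination in decentralized environments and proposes CoordLight, a MARL framework that scales to network-wide traffic signal control.
  • CoordLight introduces QDSE, a new state representation based on vehicle queuing models, which lets each intersection agent analyze, predict, and respond appropriately to local traffic dynamics.
  • For coordinated learning, it introduces Neighbor-aware Policy Optimization (NAPO), which uses an attention mechanism to identify state and action dependencies among adjacent agents and to prioritize interactions with influential neighbors.
  • The authors report consistently higher performance than existing state-of-the-art signal control methods on three real-world traffic networks (up to 196 intersections).
  • The implementation is publicly available on GitHub, making it easier for researchers to reproduce and build on the work.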
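
To make the idea of a queue-based state concrete, here is a minimal sketch of a generic single-lane queue update. It is not the paper's QDSE, whose exact features are not given in this summary. The function names, the `saturation_flow` parameter, and the three-value feature tuple are all assumptions chosen for illustration.

```python
def step_queue(queue, arrivals, green, saturation_flow):
    """Advance one lane's queue by one time step (all values in vehicles).

    Hypothetical helper: a plain conservation-of-vehicles update,
    not the QDSE formulation from the paper.
    """
    # Vehicles can only discharge while the lane has a green signal.
    capacity = saturation_flow if green else 0
    departures = min(queue + arrivals, capacity)
    return queue + arrivals - departures


def lane_features(queue, arrivals, green, saturation_flow):
    """Return (current queue, predicted next queue, change) for one lane."""
    nxt = step_queue(queue, arrivals, green, saturation_flow)
    return (queue, nxt, nxt - queue)


if __name__ == "__main__":
    # A lane holding 5 vehicles, 2 arriving, green, discharging up to 3 per step.
    print(lane_features(5, 2, True, 3))
```

An agent that sees the predicted change alongside the current queue can tell a lane that is draining from one that is building up, which is the kind of local prediction a queuing-model state is meant to support.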

Abstract

Adaptive traffic signal control (ATSC) is crucial in alleviating congestion, maximizing throughput and promoting sustainable mobility in ever-expanding cities. Multi-Agent Reinforcement Learning (MARL) has recently shown significant potential in addressing complex traffic dynamics, but the intricacies of partial observability and coordination in decentralized environments still remain key challenges in formulating scalable and efficient control strategies. To address these challenges, we present CoordLight, a MARL-based framework designed to improve intra-neighborhood traffic by enhancing decision-making at individual junctions (agents), as well as coordination with neighboring agents, thereby scaling up to network-level traffic optimization. Specifically, we introduce the Queue Dynamic State Encoding (QDSE), a novel state representation based on vehicle queuing models, which strengthens the agents' capability to analyze, predict, and respond to local traffic dynamics. We further propose an advanced MARL algorithm, named Neighbor-aware Policy Optimization (NAPO). It integrates an attention mechanism that discerns the state and action dependencies among adjacent agents, aiming to facilitate more coordinated decision-making, and to improve policy learning updates through robust advantage calculation. This enables agents to identify and prioritize crucial interactions with influential neighbors, thus enhancing the targeted coordination and collaboration among agents. Through comprehensive evaluations against state-of-the-art traffic signal control methods over three real-world traffic datasets composed of up to 196 intersections, we empirically show that CoordLight consistently exhibits superior performance across diverse traffic networks with varying traffic flows. The code is available at https://github.com/marmotlab/CoordLight
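
The attention step described in the abstract can be illustrated with a small sketch of scaled dot-product attention over neighbor feature vectors. This is not NAPO itself: a trained model would use learned query, key, and value projections, whereas this sketch uses the raw vectors directly. The function name `attend_to_neighbors` and the toy feature vectors are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend_to_neighbors(ego, neighbors):
    """Weight each neighbor by its similarity to the ego agent's features.

    Hypothetical helper: identity projections stand in for the learned
    query/key/value matrices a real attention layer would use.
    Returns (attention weights, weighted average of neighbor features).
    """
    scale = math.sqrt(len(ego))
    scores = [dot(ego, n) / scale for n in neighbors]
    weights = softmax(scores)
    context = [
        sum(w * n[i] for w, n in zip(weights, neighbors))
        for i in range(len(ego))
    ]
    return weights, context


if __name__ == "__main__":
    ego = [1.0, 0.0]
    neighbors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    weights, context = attend_to_neighbors(ego, neighbors)
    print(weights, context)
```

Neighbors whose features align with the ego agent's receive larger weights, so their information dominates the aggregated context. That is the sense in which attention lets an agent prioritize influential neighbors.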