Topology-Aware Graph Reinforcement Learning for Energy Storage Systems Optimal Dispatch in Distribution Networks

arXiv cs.LG / 3/30/2026

Key Points

  • The paper addresses optimal dispatch of energy storage systems in distribution networks by jointly targeting operating cost and voltage security despite time-varying conditions and topology changes.
  • It proposes a topology-aware reinforcement learning framework built on TD3 (Twin Delayed Deep Deterministic Policy Gradient), with graph neural networks (GCNs, TAGConv, and GATs) serving as graph feature encoders to support fast online decision making.
  • Experiments on 34-bus and 69-bus test systems show that GNN-based controllers reduce both the number and magnitude of voltage violations, with stronger benefits observed for the 69-bus system and under topology reconfiguration.
  • On the 69-bus case, TD3-GCN and TD3-TAGConv also achieve a lower saved cost relative to the NLP benchmark than the neural-network (NN) baseline does.
  • Cross-system transfer gains are case-dependent, and zero-shot transfer between fundamentally different systems leads to notable performance degradation and more voltage magnitude violations.
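
To make the idea of a topology-aware graph feature encoder concrete, the following is a minimal sketch of a single graph-convolution step of the kind a GCN encoder is built from. It is not the authors' implementation: the 4-bus feeder, the per-bus features (voltage and net load in p.u.), the weight matrix, and the helper names `normalized_adjacency` and `gcn_layer` are all hypothetical and chosen only for illustration. The sketch assumes the common symmetric normalization D^-1/2 (A + I) D^-1/2 of the bus adjacency matrix.

```python
import math

def normalized_adjacency(num_buses, lines):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected bus graph."""
    # Start from the identity so every bus keeps its own features (self-loops).
    a = [[1.0 if i == j else 0.0 for j in range(num_buses)]
         for i in range(num_buses)]
    for u, v in lines:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_buses)]
            for i in range(num_buses)]

def matmul(x, y):
    """Plain dense matrix product for lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    hidden = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]

# Hypothetical 4-bus feeder; each bus carries [voltage p.u., net load p.u.].
features = [[1.00, 0.0], [0.98, 0.3], [0.96, 0.5], [0.95, 0.2]]
weights = [[0.5, -0.2], [0.1, 0.4]]  # illustrative values, not trained

radial = [(0, 1), (1, 2), (2, 3)]
reconfigured = [(0, 1), (1, 2), (1, 3)]  # assumed switch action: bus 3 now fed from bus 1

emb_radial = gcn_layer(normalized_adjacency(4, radial), features, weights)
emb_reconf = gcn_layer(normalized_adjacency(4, reconfigured), features, weights)
```

With identical bus measurements, the two embeddings differ only because the line list changed, which is the sense in which such an encoder can expose a reconfiguration to the downstream TD3 actor and critic. A plain fully connected encoder over stacked bus features would produce the same output for both topologies.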

Abstract

Optimal dispatch of energy storage systems (ESSs) in distribution networks involves jointly improving operating economy and voltage security under time-varying conditions and possible topology changes. To support fast online decision making, we develop a topology-aware Reinforcement Learning architecture based on Twin Delayed Deep Deterministic Policy Gradient (TD3), which integrates graph neural networks (GNNs) as graph feature encoders for ESS dispatch. We conduct a systematic investigation of three GNN variants: graph convolutional networks (GCNs), topology adaptive graph convolutional networks (TAGConv), and graph attention networks (GATs) on the 34-bus and 69-bus systems, and evaluate robustness under multiple topology reconfiguration cases as well as cross-system transfer between networks with different system sizes. Results show that GNN-based controllers consistently reduce the number and magnitude of voltage violations, with clearer benefits on the 69-bus system and under reconfiguration; on the 69-bus system, TD3-GCN and TD3-TAGConv also achieve lower saved cost relative to the NLP benchmark than the NN baseline. We also highlight that transfer gains are case-dependent, and zero-shot transfer between fundamentally different systems results in notable performance degradation and increased voltage magnitude violations. This work is available at: https://github.com/ShuyiGao/GNNs_RL_ESSs and https://github.com/distributionnetworksTUDelft/GNNs_RL_ESSs.
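
The abstract reports both the number and the magnitude of voltage violations. One plausible way to compute such metrics is sketched below. The 0.95 to 1.05 p.u. band, the function name `voltage_violations`, and the sample voltage profile are assumptions for illustration; the paper's exact limits and metric definitions are not given in this summary.

```python
def voltage_violations(voltages, v_min=0.95, v_max=1.05):
    """Count violations and sum their magnitudes over a voltage profile.

    voltages: list of time steps, each a list of bus voltages in p.u.
    v_min, v_max: assumed admissible band (not taken from the paper).
    """
    count = 0
    magnitude = 0.0
    for step in voltages:
        for v in step:
            # Distance outside the band; zero when the voltage is within limits.
            excess = max(v_min - v, v - v_max, 0.0)
            if excess > 0.0:
                count += 1
                magnitude += excess
    return count, magnitude

# Hypothetical profile: 2 time steps, 3 buses.
profile = [[1.00, 0.97, 0.94],
           [1.06, 1.00, 0.95]]
num_violations, total_magnitude = voltage_violations(profile)
```

In this sample, one bus falls below the lower limit and one exceeds the upper limit, while a voltage sitting exactly on a limit is not counted. In a reinforcement learning setup, a term of this kind is often subtracted from the reward as a penalty, although whether the authors do so in this form is not stated here.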