Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids

arXiv cs.LG / 4/3/2026


Key Points

  • The paper introduces a physics-informed reinforcement learning approach for power-grid topology control, addressing the combinatorial growth of the action space and the high cost of simulating action outcomes.
  • It combines semi-Markov control with a Gibbs prior that encodes physical system constraints over the action space, so decisions are taken primarily when the grid enters hazardous regimes.
  • A graph neural network surrogate predicts post-action overload risk, and those predictions are used to construct a state-dependent candidate action set and reweight policy logits for more efficient action selection.
  • Experiments on three increasingly difficult realistic benchmarks show strong trade-offs between control quality and computational efficiency, including near-oracle performance on simpler tasks and substantial gains over PPO and specialized baselines on harder settings.
  • Overall, the results suggest the method preserves the flexibility of learned policies while significantly reducing exploration difficulty, online simulation cost, and decision latency for grid topology control.
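The candidate-set construction and logit reweighting described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the risk array, and the temperature `beta` are all assumptions. The idea is that a Gibbs prior assigns log-weight proportional to the negative predicted overload risk, so low-risk actions dominate; only the k lowest-risk actions are retained as the state-dependent candidate set, and the prior is added to the policy logits in log-space before sampling.

```python
import numpy as np

def gibbs_reweight(policy_logits, predicted_risk, beta=5.0, k=8):
    """Illustrative sketch: combine learned policy logits with a Gibbs
    prior built from surrogate-predicted overload risk.

    All names and values here are hypothetical, not the paper's code.
    """
    # State-dependent candidate set: the k actions with lowest predicted risk.
    candidates = np.argsort(predicted_risk)[:k]

    # Gibbs prior log-weights over candidates: log p(a) proportional to
    # -beta * risk(a), so safer actions receive higher prior mass.
    prior_logits = -beta * predicted_risk[candidates]

    # Reweight policy logits by adding the prior in log-space, then softmax.
    combined = policy_logits[candidates] + prior_logits
    combined -= combined.max()  # numerical stability before exponentiation
    probs = np.exp(combined) / np.exp(combined).sum()
    return candidates, probs
```

Restricting the softmax to the candidate set is what reduces online simulation cost: only k actions, rather than the full combinatorial action space, need to be evaluated or sampled at decision time.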

Abstract

Topology control for power grid operation is a challenging sequential decision-making problem because the action space grows combinatorially with the size of the grid and action evaluation through simulation is computationally expensive. We propose a physics-informed Reinforcement Learning framework that combines semi-Markov control with a Gibbs prior over the action space that encodes the system's physics. Decisions are taken only when the grid enters a hazardous regime, while a graph neural network surrogate predicts the post-action overload risk of feasible topology actions. These predictions are used to construct a physics-informed Gibbs prior that both selects a small state-dependent candidate set and reweights policy logits before action selection. In this way, our method reduces exploration difficulty and online simulation cost while preserving the flexibility of a learned policy. We evaluate the approach in three realistic benchmark environments of increasing difficulty. Across all settings, the proposed method achieves a strong balance between control quality and computational efficiency: it matches oracle-level performance while being approximately 6× faster on the first benchmark, reaches 94.6% of oracle reward with roughly 200× lower decision time on the second one, and on the most challenging benchmark improves over a PPO baseline by up to 255% in reward and 284% in survived steps while remaining about 2.5× faster than a strong specialized engineering baseline. These results show that our method provides an effective mechanism for topology control in power grids.
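The hazard-triggered, semi-Markov aspect of the abstract can be illustrated with a short sketch. This is a hypothetical outline under assumed names: the line-loading threshold, `is_hazardous`, and the callbacks are illustrative, not from the paper. The point is that the expensive topology decision machinery runs only on entry into a hazardous regime; in normal operation a do-nothing action is applied, which is what keeps decision latency low on average.

```python
def is_hazardous(line_loadings, threshold=0.95):
    """Flag a hazardous regime when any line exceeds `threshold` of its
    thermal limit (illustrative criterion, not the paper's exact rule)."""
    return max(line_loadings) > threshold

def control_step(line_loadings, choose_topology_action, do_nothing_action):
    # Semi-Markov control: invoke the learned policy (and its surrogate-based
    # candidate construction) only in hazardous states; otherwise skip.
    if is_hazardous(line_loadings):
        return choose_topology_action(line_loadings)
    return do_nothing_action
```

Because non-hazardous steps bypass the policy entirely, the number of surrogate and simulator calls scales with the number of hazard events rather than with episode length.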