Resolving gradient pathology in physics-informed epidemiological models

arXiv cs.LG · 2026-03-26

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper addresses training instability in physics-informed neural networks (PINNs) used for epidemiological compartment models like SEIR, where gradients from data loss and physics residual can conflict and cause deadlock or slow convergence.
  • It proposes a new method, conflict-gated gradient scaling (CGGS), which uses cosine similarity between data and physics gradients to dynamically adjust the penalty weight in a geometric, direction-aware way rather than only rescaling magnitudes.
  • The method suppresses the physical constraint when gradient directions disagree and re-enables it when they align, effectively prioritizing data fidelity during conflict-heavy phases.
  • The authors prove that CGGS preserves an $O(1/T)$ convergence rate for smooth non-convex objectives, while convergence guarantees can fail under fixed-weight or magnitude-balanced training when gradients conflict.
  • Experiments on stiff epidemiological systems show improved parameter estimation, including better peak recovery and faster convergence than magnitude-based baselines, with an emergent curriculum-learning effect.
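The gating idea in the bullets above can be sketched in a few lines. The paper's exact gating function is not reproduced here; this sketch assumes a simple clipped-cosine gate in which the physics penalty weight is driven to zero when the data and physics gradients point in conflicting directions and restored toward its maximum as they align:

```python
import numpy as np

def cggs_weight(g_data, g_phys, lam_max=1.0, eps=1e-12):
    """Conflict-gated weight for the physics loss (illustrative sketch).

    g_data, g_phys: lists of per-parameter gradient arrays for the data
    loss and the physics residual, respectively. The clipped-cosine form
    below is an assumption, not the paper's exact formula.
    """
    gd = np.concatenate([g.ravel() for g in g_data])
    gp = np.concatenate([g.ravel() for g in g_phys])
    cos = gd @ gp / (np.linalg.norm(gd) * np.linalg.norm(gp) + eps)
    # Suppress the physics constraint under directional conflict
    # (cos <= 0); re-enable it smoothly as the gradients align.
    return lam_max * max(0.0, float(cos))
```

Because the gate depends only on direction, not magnitude, it is cheap to compute (one dot product and two norms per step) compared with projection-based conflict resolution.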

Abstract

Physics-informed neural networks (PINNs) are increasingly used in mathematical epidemiology to bridge the gap between noisy clinical data and compartmental models, such as the susceptible-exposed-infected-removed (SEIR) model. However, training these hybrid networks is often unstable due to competing optimization objectives. As established in recent literature on "gradient pathology," the gradient vectors derived from the data loss and the physical residual often point in conflicting directions, leading to slow convergence or optimization deadlock. While existing methods attempt to resolve this by balancing gradient magnitudes or projecting conflicting vectors, we propose conflict-gated gradient scaling (CGGS), a computationally efficient alternative that resolves gradient conflicts in physics-informed neural networks for epidemiological modelling and ensures stable, efficient training. This method uses the cosine similarity between the data and physics gradients to dynamically modulate the penalty weight. Unlike standard annealing schemes that only normalize scales, CGGS acts as a geometric gate: it suppresses the physical constraint when directional conflict is high, allowing the optimizer to prioritize data fidelity, and restores the constraint when gradients align. We prove that this gating mechanism preserves the standard O(1/T) convergence rate for smooth non-convex objectives, a guarantee that fails under fixed-weight or magnitude-balanced training when gradients conflict. We demonstrate that this mechanism autonomously induces a curriculum-learning effect, improving parameter estimation in stiff epidemiological systems. Our empirical results show improved peak recovery and faster convergence over magnitude-based baselines.
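The qualitative behavior described in the abstract — suppress the physics term under conflict, restore it under alignment — can be seen on a toy problem. This is not the paper's SEIR experiment: it substitutes two quadratic objectives (with different minimizers, so their gradients genuinely conflict near the midpoint) for the data loss and physics residual, and uses the same assumed clipped-cosine gate:

```python
import numpy as np

# Stand-ins for the two competing objectives (hypothetical, not from
# the paper): data loss ||theta - a||^2 / 2 and physics residual
# ||theta - b||^2 / 2 with distinct minimizers a and b.
a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def grad_data(theta):
    return theta - a

def grad_phys(theta):
    return theta - b

theta, eta, lam_max = np.array([5.0, 5.0]), 0.1, 1.0
for _ in range(200):
    gd, gp = grad_data(theta), grad_phys(theta)
    cos = gd @ gp / (np.linalg.norm(gd) * np.linalg.norm(gp) + 1e-12)
    lam = lam_max * max(0.0, float(cos))   # conflict gate
    theta = theta - eta * (gd + lam * gp)
```

Far from both minimizers the gradients align (cos > 0), so both terms drive the descent; once the iterate enters the region where the gradients oppose each other (cos <= 0), the gate zeroes the physics weight and the optimizer follows the data gradient alone, converging to the data minimizer `a`. This mirrors the data-fidelity-first, curriculum-like behavior the abstract attributes to CGGS.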