Flood Risk Follows Valleys, Not Grids: Graph Neural Networks for Flash Flood Susceptibility Mapping in Himachal Pradesh with Conformal Uncertainty Quantification

arXiv cs.LG / 3/18/2026

Key Points

  • A Graph Neural Network (GraphSAGE) trained on a watershed connectivity graph outperforms pixel-based models for flash flood susceptibility mapping in Himachal Pradesh (HP), achieving AUC 0.978 ± 0.017 versus 0.881 for the best baseline.
  • The study uses a six-year Sentinel-1 SAR flood inventory (2018-2023) and 12 environmental variables at 30 m resolution, and evaluates all models with leave-one-basin-out spatial cross-validation to avoid the 5-15% AUC inflation of random splits.
  • Conformal prediction produces the first HP susceptibility maps with statistically guaranteed 90% coverage intervals.
  • High-susceptibility zones overlap critical infrastructure, including 1,457 km of highways, 2,759 bridges, and 4 major hydroelectric installations, which makes the maps directly relevant to planning and risk management.
  • Conformal intervals reached 82.9% empirical coverage on the held-out 2023 test set; coverage fell to 45-59% in high-risk zones, which the authors link to SAR label noise and flag as a target for future work.
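
The leave-one-basin-out evaluation mentioned above can be illustrated with a minimal sketch. It assumes each sample (for example, a sub-watershed) carries a basin label; the function name `leave_one_basin_out` and the labels `basin_A`, `basin_B`, `basin_C` are hypothetical and do not come from the paper. The point is only that every sample from the held-out basin is excluded from training, so spatially adjacent samples cannot leak across the split.

```python
from collections import defaultdict


def leave_one_basin_out(basin_ids):
    """Yield (held_out_basin, train_idx, test_idx) once per distinct basin.

    basin_ids[i] is the (hypothetical) basin label of sample i.
    """
    by_basin = defaultdict(list)
    for idx, basin in enumerate(basin_ids):
        by_basin[basin].append(idx)
    for basin in sorted(by_basin):
        test_idx = by_basin[basin]
        # Train on every sample that does not belong to the held-out basin.
        train_idx = [i for i, b in enumerate(basin_ids) if b != basin]
        yield basin, train_idx, test_idx


# Toy data: six samples spread over three made-up basins.
basins = ["basin_A", "basin_A", "basin_B", "basin_B", "basin_C", "basin_C"]
splits = list(leave_one_basin_out(basins))
for basin, train_idx, test_idx in splits:
    print(basin, "train:", train_idx, "test:", test_idx)
```

A random split would instead scatter samples from the same basin across both sides, which is the situation the summary associates with inflated AUC.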

Abstract

Flash floods are the most destructive natural hazard in Himachal Pradesh (HP), India, causing over 400 fatalities and $1.2 billion in losses in the 2023 monsoon season alone. Existing risk maps treat every pixel independently, ignoring the basic fact that flooding upstream raises risk downstream. We address this with a Graph Neural Network (GraphSAGE) trained on a watershed connectivity graph (460 sub-watersheds, 1,700 directed edges), built from a six-year Sentinel-1 SAR flood inventory (2018-2023, 3,000 events) and 12 environmental variables at 30 m resolution. Four pixel-based ML models (RF, XGBoost, LightGBM, stacking ensemble) serve as baselines. All models are evaluated with leave-one-basin-out spatial cross-validation to avoid the 5-15% AUC inflation of random splits. Conformal prediction produces the first HP susceptibility maps with statistically guaranteed 90% coverage intervals. The GNN achieved AUC = 0.978 ± 0.017, outperforming the best baseline (AUC = 0.881) and the published HP benchmark (AUC = 0.88). The +0.097 gain confirms that river connectivity carries predictive signal that pixel-based models miss. High-susceptibility zones overlap 1,457 km of highways (including 217 km of the Manali-Leh corridor), 2,759 bridges, and 4 major hydroelectric installations. Conformal intervals achieved 82.9% empirical coverage on the held-out 2023 test set; lower coverage in high-risk zones (45-59%) points to SAR label noise as a target for future work.
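
To make the gap between a nominal 90% level and a measured empirical coverage concrete, here is a minimal sketch of split conformal prediction on synthetic data. It is not the paper's procedure: the absolute-residual score, the Gaussian noise model, and the helper names `conformal_quantile`, `empirical_coverage`, and `simulate` are all assumptions made for illustration.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank in sorted order
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def empirical_coverage(preds, targets, q):
    """Fraction of targets that fall inside [pred - q, pred + q]."""
    hits = sum(1 for p, y in zip(preds, targets) if abs(y - p) <= q)
    return hits / len(preds)


def simulate(n, rng):
    """Synthetic predictions and noisy targets (assumed noise model)."""
    preds = [rng.random() for _ in range(n)]
    targets = [p + rng.gauss(0.0, 0.1) for p in preds]
    return preds, targets


rng = random.Random(0)
cal_preds, cal_targets = simulate(1000, rng)    # calibration set
test_preds, test_targets = simulate(1000, rng)  # held-out test set

# Nonconformity score: absolute residual on the calibration set.
scores = [abs(y - p) for p, y in zip(cal_preds, cal_targets)]
q = conformal_quantile(scores, alpha=0.10)
coverage = empirical_coverage(test_preds, test_targets, q)
print(f"interval half-width: {q:.3f}, empirical coverage: {coverage:.3f}")
```

The 90% guarantee is marginal and relies on calibration and test data being exchangeable. When that assumption is strained, for instance by noisy labels or by a test period that differs from the calibration period, measured coverage can fall below the nominal level, overall or within particular subgroups such as high-risk zones.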