Reversible Residual Normalization Alleviates Spatio-Temporal Distribution Shift

arXiv cs.LG / 4/20/2026


Key Points

  • The paper addresses how distribution shift can severely degrade deep forecasting models, extending the problem from individual time series to spatio-temporal data on graphs.
  • It proposes Reversible Residual Normalization (RRN), which uses spatially-aware, invertible transformations to handle both temporal drift within individual node series and statistical heterogeneity across the network.
  • RRN combines invertible residual blocks with graph convolution operations, integrating Center Normalization and spectral-constrained graph neural networks to model and normalize complex spatio-temporal relationships.
  • The method is bidirectional: models learn in a normalized latent space, and an inverse transform restores the original statistics. It aims to be a robust, model-agnostic solution for forecasting on dynamic spatio-temporal systems.
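
To make the "normalize, then invert" idea concrete, here is a minimal sketch of plain per-node instance normalization, the baseline family that RRN builds on. It is not the paper's method. The function names `normalize` and `denormalize`, the `eps` value, and the toy data are all assumptions made for illustration.

```python
import math

def normalize(window, eps=1e-5):
    """Standardize each node's series by its own mean and std.

    `window` is a list of per-node series (hypothetical layout).
    Returns the normalized series plus the statistics needed to invert.
    """
    normed, stats = [], []
    for series in window:
        mean = sum(series) / len(series)
        var = sum((v - mean) ** 2 for v in series) / len(series)
        std = math.sqrt(var + eps)  # eps guards against constant series
        normed.append([(v - mean) / std for v in series])
        stats.append((mean, std))
    return normed, stats

def denormalize(normed, stats):
    """Inverse transform: restore each node's original scale and offset."""
    return [
        [v * std + mean for v in series]
        for series, (mean, std) in zip(normed, stats)
    ]

# Two nodes with very different scales (toy data).
window = [[10.0, 12.0, 14.0, 16.0], [1000.0, 900.0, 1100.0, 1000.0]]
normed, stats = normalize(window)
restored = denormalize(normed, stats)
```

A forecasting model would sit between the two calls: it sees only `normed`, and its outputs are passed through `denormalize`. Note that each node is treated independently here, which is exactly the limitation the paper targets: nothing in this sketch uses the graph structure.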

Abstract

Distribution shift severely degrades the performance of deep forecasting models. While this issue is well-studied for individual time series, it remains a significant challenge in the spatio-temporal domain. Effective solutions like instance normalization and its variants can mitigate temporal shifts by standardizing statistics. However, distribution shift on a graph is far more complex, involving not only the drift of individual node series but also heterogeneity across the spatial network where different nodes exhibit distinct statistical properties. To tackle this problem, we propose Reversible Residual Normalization (RRN), a novel framework that performs spatially-aware invertible transformations to address distribution shift in both spatial and temporal dimensions. Our approach integrates graph convolutional operations within invertible residual blocks, enabling adaptive normalization that respects the underlying graph structure while maintaining reversibility. By combining Center Normalization with spectral-constrained graph neural networks, our method captures and normalizes complex spatio-temporal relationships in a data-driven manner. The bidirectional nature of our framework allows models to learn in a normalized latent space and recover original distributional properties through inverse transformation, offering a robust and model-agnostic solution for forecasting on dynamic spatio-temporal systems.
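
The abstract does not give the block's equations, so the following is only a sketch of one common way to build an invertible residual block around a graph convolution. It assumes the block has the form y = x + g(x), where the branch g is a contraction (Lipschitz constant below 1), so the inverse can be recovered by fixed-point iteration. The branch g(X) = c · tanh(Â X), the scalar `c`, and all function names are assumptions for illustration, not the authors' architecture.

```python
import math

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric, non-negative A.

    The spectral norm of this matrix is at most 1.
    """
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def residual_branch(x, a_hat, c=0.5):
    """g(X) = c * tanh(A_hat X); a contraction whenever 0 <= c < 1."""
    n, t = len(x), len(x[0])
    return [
        [c * math.tanh(sum(a_hat[i][k] * x[k][s] for k in range(n))) for s in range(t)]
        for i in range(n)
    ]

def forward(x, a_hat, c=0.5):
    """y = x + g(x)."""
    g = residual_branch(x, a_hat, c)
    return [[xv + gv for xv, gv in zip(xr, gr)] for xr, gr in zip(x, g)]

def inverse(y, a_hat, c=0.5, iters=50):
    """Recover x from y by iterating x <- y - g(x)."""
    x = [row[:] for row in y]  # start the iteration from y
    for _ in range(iters):
        g = residual_branch(x, a_hat, c)
        x = [[yv - gv for yv, gv in zip(yr, gr)] for yr, gr in zip(y, g)]
    return x

# Toy 3-node path graph with 2 time steps per node.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
a_hat = normalized_adjacency(adj)
x = [[1.0, 2.0], [-0.5, 0.3], [4.0, -1.0]]
y = forward(x, a_hat)
x_rec = inverse(y, a_hat)
```

The contraction argument is what makes the inverse well defined: Â has spectral norm at most 1 and tanh is 1-Lipschitz, so g has Lipschitz constant at most `c`, and the iteration error shrinks by at least that factor each step. A learned weight matrix inside the branch would need its own spectral norm bounded for the same guarantee to hold, which is one plausible reading of "spectral-constrained" in the abstract.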