Online Distributional Regression

arXiv stat.ML / 4/27/2026

Key Points

  • The paper addresses how to perform online learning for large-scale streaming data when probabilistic forecasting is needed, including learning conditional heteroskedasticity and higher conditional moments.
  • It proposes an online estimation method for regularized, linear distributional models by combining advances in online LASSO estimation with the GAMLSS (Generalized Additive Models for Location, Scale, and Shape) framework.
  • The authors demonstrate the approach via a case study in day-ahead electricity price forecasting, showing competitive predictive performance.
  • They also report that incremental estimation substantially reduces computational effort, and they provide an efficient Python implementation in the package ondil.
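
To make the idea of online LASSO estimation concrete, here is a minimal sketch of one common approach: keep running sufficient statistics (the Gram matrix and the feature-response cross-products), optionally discount them with a forgetting factor, and re-run a few warm-started coordinate-descent sweeps after each new observation. This is an illustration only, not the ondil API and not necessarily the authors' exact algorithm. The class name `OnlineLasso` and the parameters `lam`, `forget`, and `n_sweeps` are hypothetical.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


class OnlineLasso:
    """Illustrative online LASSO (hypothetical class, not part of ondil).

    Minimizes (1 / (2 n)) * ||y - X b||^2 + lam * ||b||_1, where n is the
    (possibly discounted) number of observations seen so far.
    """

    def __init__(self, n_features, lam=0.1, forget=1.0, n_sweeps=10):
        self.gram = np.zeros((n_features, n_features))  # running X^T X
        self.xy = np.zeros(n_features)                  # running X^T y
        self.beta = np.zeros(n_features)                # current coefficients
        self.lam = lam
        self.forget = forget      # 1.0 means no forgetting
        self.n_sweeps = n_sweeps  # coordinate-descent sweeps per update
        self.weight = 0.0         # effective number of observations

    def update(self, x, y):
        """Absorb one observation (x, y) and refresh the coefficients."""
        x = np.asarray(x, dtype=float)
        # Discount the old statistics, then add the new observation.
        self.gram = self.forget * self.gram + np.outer(x, x)
        self.xy = self.forget * self.xy + x * y
        self.weight = self.forget * self.weight + 1.0
        # Warm-started coordinate descent on the Gram form of the problem.
        for _ in range(self.n_sweeps):
            for j in range(len(self.beta)):
                if self.gram[j, j] == 0.0:
                    continue
                # Correlation of feature j with the residual that excludes j.
                rho = (self.xy[j] - self.gram[j] @ self.beta
                       + self.gram[j, j] * self.beta[j])
                self.beta[j] = (soft_threshold(rho, self.lam * self.weight)
                                / self.gram[j, j])
        return self.beta
```

Because each update touches only the stored statistics, its cost depends on the number of features rather than on the length of the stream, which is the intuition behind the reduced computational effort of incremental estimation.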

Abstract

Large-scale streaming data are common in modern machine learning applications and have led to the development of online learning algorithms. Many fields, such as supply chain management, weather and meteorology, energy markets, and finance, have pivoted toward probabilistic forecasting. This results in the need not only for accurate learning of the expected value but also for learning the conditional heteroskedasticity and conditional moments. Against this backdrop, we present a methodology for online estimation of regularized, linear distributional models. The proposed algorithm combines recent developments in online estimation of LASSO models with the well-known GAMLSS framework. We provide a case study on day-ahead electricity price forecasting, in which we show the competitive performance of the incremental estimation combined with strongly reduced computational effort. Our algorithms are implemented in a computationally efficient Python package ondil.
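
As a small illustration of what learning conditional heteroskedasticity means in a linear distributional model, the sketch below evaluates a Gaussian negative log-likelihood in which both the mean and the standard deviation depend linearly on the features, the latter through a log link. The Gaussian family, the log link, the simulated data, and the function name `gaussian_nll` are assumptions made for this example; they are not taken from the paper or from ondil.

```python
import numpy as np


def gaussian_nll(y, X, beta_mu, beta_sigma):
    """Average Gaussian negative log-likelihood with linear predictors.

    mean:            mu    = X @ beta_mu
    scale (log link): sigma = exp(X @ beta_sigma)
    """
    mu = X @ beta_mu
    log_sigma = X @ beta_sigma
    z = (y - mu) / np.exp(log_sigma)
    return float(np.mean(log_sigma + 0.5 * np.log(2.0 * np.pi) + 0.5 * z**2))


# Simulated data (assumption): the noise level grows with the feature x.
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])      # intercept plus one feature
beta_mu = np.array([1.0, 2.0])            # location coefficients
beta_sigma = np.array([-1.0, 1.0])        # log-scale coefficients
y = X @ beta_mu + np.exp(X @ beta_sigma) * rng.normal(size=n)

# Distributional model: the scale varies with x.
nll_het = gaussian_nll(y, X, beta_mu, beta_sigma)

# Mean-only view: same mean, one constant scale (its maximum-likelihood value).
const_log_sigma = np.log(np.std(y - X @ beta_mu))
nll_hom = gaussian_nll(y, X, beta_mu, np.array([const_log_sigma, 0.0]))

print(f"heteroskedastic NLL: {nll_het:.3f}, constant-variance NLL: {nll_hom:.3f}")
```

On data like these, the model with a feature-dependent scale attains a lower negative log-likelihood than the constant-variance one, which is the gain in probabilistic forecast quality that modeling the scale parameter is meant to capture.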