Extraction of informative statistical features in the problem of forecasting time series generated by Itô-type processes

arXiv stat.ML / 4/21/2026


Key Points

  • The paper studies how to extract highly informative statistical features from time series assumed to come from stochastic processes described by Itô stochastic differential equations with unknown drift and diffusion coefficients.
  • Instead of adding external data, it constructs additional features using parameters from statistically adjusted mixture-type models that capture regularities observed directly in the time series.
  • It proposes algorithms for estimating the underlying Itô coefficients via statistical reconstruction, leveraging separation methods for normal mixtures, yielding both uniform (state-independent) and non-uniform (state-dependent) parameterizations.
  • The non-uniform reconstruction is interpreted as a stochastic analogue of a Taylor expansion, enabling features that account for how coefficients vary with the current process value.
  • Experiments using simple autoregressive prediction (to avoid neural-network-architecture bias) show that including these extracted statistical features improves time-series forecasting performance.
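As a rough illustration of the mixture-separation idea in the key points above, the sketch below fits a normal mixture to the increments of a window of a time series and turns the fitted parameters into a feature vector. It is a minimal sketch under stated assumptions, not the paper's algorithm: plain EM is swapped in for whatever separation method the authors use, the number of components `k=2` is arbitrary, and the names `fit_normal_mixture` and `mixture_features` are hypothetical. The within-component and between-component variances follow from the law of total variance for a mixture.

```python
import numpy as np


def fit_normal_mixture(x, k=2, n_iter=200, tol=1e-8):
    """Fit a k-component 1-D normal mixture to x with plain EM.

    Assumption: plain EM stands in for the paper's separation method.
    Returns (weights, means, stds), each of shape (k,), sorted by mean.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Initialise means at evenly spaced quantiles, stds at the sample std.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    stds = np.full(k, x.std() + 1e-12)
    weights = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        z = (x[:, None] - means) / stds
        log_p = (np.log(np.maximum(weights, 1e-300)) - np.log(stds)
                 - 0.5 * z ** 2 - 0.5 * np.log(2.0 * np.pi))
        top = log_p.max(axis=1, keepdims=True)
        log_norm = top + np.log(np.exp(log_p - top).sum(axis=1, keepdims=True))
        resp = np.exp(log_p - log_norm)
        # M-step: weighted updates of weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        stds = np.sqrt(np.maximum(var, 1e-12))
        # Stop when the log-likelihood (of the previous parameters) stalls.
        ll = float(log_norm.sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    order = np.argsort(means)
    return weights[order], means[order], stds[order]


def mixture_features(window, k=2):
    """Feature vector of length 3*k + 2 from the increments of one window.

    Layout: weights, means, stds, then the within-component and
    between-component parts of the mixture variance.
    """
    inc = np.diff(np.asarray(window, dtype=float))
    w, m, s = fit_normal_mixture(inc, k)
    mean = np.sum(w * m)
    within = np.sum(w * s ** 2)             # average component variance
    between = np.sum(w * (m - mean) ** 2)   # spread of component means
    return np.concatenate([w, m, s, [within, between]])
```

Applied to a sliding window that ends at the current observation, `mixture_features` yields one feature vector per time step that uses only the series itself, which mirrors the "no external data" constraint described above. Because these parameters do not depend on the current value of the process, the sketch corresponds at most to the uniform (state-independent) case.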

Abstract

In this paper, we consider the problem of extracting the most informative features from time series regarded as observed values of stochastic processes that satisfy Itô stochastic differential equations with unknown random drift and diffusion coefficients. We do not draw on any additional information and use only the information contained in the time series itself. Therefore, as additional features, we use the parameters of statistically adjusted mixture-type models of the observed regularities in the behavior of the time series. Several algorithms for constructing these parameters are discussed. These algorithms are based on statistical reconstruction of the coefficients, which in turn is based on statistical separation of normal mixtures. We obtain two types of parameters by means of uniform and non-uniform statistical reconstruction of the coefficients of the underlying Itô process. The coefficients reconstructed by the uniform techniques do not depend on the current value of the process, while the non-uniform techniques reconstruct the coefficients taking into account their dependence on the value of the process. In effect, the non-uniform techniques used in this paper represent a stochastic analog of the Taylor expansion for the time series. The efficiency of the resulting additional features is compared by using them in autoregressive algorithms for time series prediction. To obtain clean conclusions that are not affected by extraneous factors, for example those related to a particular choice of neural network architecture, we use only simple autoregressive algorithms. We show that the use of the additional statistical features improves the prediction.
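The comparison described at the end of the abstract can be pictured with a small experiment: fit a simple autoregressive model by least squares, once on lagged values alone and once with extra feature columns, and compare out-of-sample errors. The Python sketch below is an assumption-laden illustration rather than the authors' setup: the function names `lagged_design` and `out_of_sample_mse`, the ordinary least-squares fit, the single chronological train/test split, and the mean-squared-error metric are all choices made here for brevity.

```python
import numpy as np


def lagged_design(series, p, extra=None):
    """Build (X, y) for one-step-ahead AR(p) regression.

    The row that predicts series[t] holds an intercept, the lags
    series[t-1], ..., series[t-p] and, if given, the feature row
    extra[t-1], so only information available before time t is used.
    """
    s = np.asarray(series, dtype=float)
    n = s.size
    cols = [np.ones(n - p)]
    cols += [s[p - j : n - j] for j in range(1, p + 1)]
    if extra is not None:
        e = np.asarray(extra, dtype=float).reshape(n, -1)
        cols += [e[p - 1 : n - 1, j] for j in range(e.shape[1])]
    return np.column_stack(cols), s[p:]


def out_of_sample_mse(series, p, extra=None, train_frac=0.7):
    """Fit on the first part of the sample, score on the remainder."""
    X, y = lagged_design(series, p, extra)
    n_train = int(train_frac * y.size)
    # Ordinary least squares on the training rows only.
    coef, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    resid = y[n_train:] - X[n_train:] @ coef
    return float(np.mean(resid ** 2))
```

Calling `out_of_sample_mse(series, p)` and `out_of_sample_mse(series, p, extra=features)` on the same series gives a like-for-like comparison, provided each row of `features` is computed only from observations up to that time step. Whether the second number is smaller depends on the data and on the features; the sketch only shows how such a comparison could be organized.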