Time-Warping Recurrent Neural Networks for Transfer Learning

arXiv cs.LG / 4/6/2026


Key Points

  • The paper proposes a time-warping based transfer learning method for Recurrent Neural Networks (especially LSTMs) to adapt models across dynamical systems that evolve on different time scales.
  • It shows theoretically that for time-lag models (a class of linear first-order differential equations), an LSTM can approximate the systems to arbitrary accuracy and can be time-warped without losing that approximation quality.
  • The method is evaluated on wildfire-related fuel moisture content (FMC) prediction, using pretrained RNNs at a 10-hour characteristic time scale and adapting them to 1-hour, 100-hour, and 1000-hour regimes.
  • Compared with several established transfer learning approaches, time-warping achieves prediction accuracy comparable to prior methods while updating only a small fraction of parameters.
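The core idea behind the bullets above can be illustrated on the time-lag model itself. The sketch below assumes the standard linear form dm/dt = (E − m)/T, where T is the characteristic time scale; this form, the variable names, and the `simulate` helper are illustrative assumptions, not the paper's code. It shows that a model built for one time scale reproduces another scale's dynamics when the time step is rescaled, which is the intuition the Time-Warping transfer method exploits.

```python
import numpy as np

def time_lag_step(m, E, dt, T):
    # Exact one-step update for the time-lag ODE dm/dt = (E - m) / T
    return E + (m - E) * np.exp(-dt / T)

def simulate(T, dt, n, m0=0.3, E=0.1):
    # Roll the time-lag model forward n steps of size dt (hypothetical helper)
    m = np.empty(n + 1)
    m[0] = m0
    for k in range(n):
        m[k + 1] = time_lag_step(m[k], E, dt, T)
    return m

# Pretrained scale T0 = 10 h; target scale T1 = 100 h.
# Time-warping: run the 10 h model with steps rescaled by T0 / T1,
# which reproduces the 100 h dynamics on the original 1 h grid.
T0, T1, dt, n = 10.0, 100.0, 1.0, 200
direct = simulate(T1, dt, n)            # 100 h model, 1 h steps
warped = simulate(T0, dt * T0 / T1, n)  # 10 h model, warped steps
print(np.max(np.abs(direct - warped)))  # agrees up to floating point
```

In the linear case the two runs coincide because exp(−dt/T1) = exp(−(dt·T0/T1)/T0); the paper's contribution is showing that an LSTM approximating such a system tolerates the same rescaling while keeping its approximation accuracy.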

Abstract

Dynamical systems describe how a physical system evolves over time. Physical processes can evolve faster or slower under different environmental conditions. We use time-warping to mean rescaling the time variable in a model of a physical system. This thesis proposes a new method of transfer learning for Recurrent Neural Networks (RNNs) based on time-warping. We prove that for a class of linear, first-order differential equations known as time-lag models, an LSTM can approximate these systems to any desired accuracy, and that the model can be time-warped while maintaining that approximation accuracy. The Time-Warping method of transfer learning is then evaluated on an applied problem: predicting fuel moisture content (FMC), an important quantity in wildfire modeling. An RNN with LSTM recurrent layers is pretrained on fuels with a characteristic time scale of 10 hours, for which large quantities of training data are available. The RNN is then adapted with transfer learning to generate predictions for fuels with characteristic time scales of 1 hour, 100 hours, and 1000 hours. The Time-Warping method is evaluated against several established methods of transfer learning. It produces predictions with accuracy comparable to those methods, despite modifying only a small fraction of the parameters that the other methods modify.