climt-paraformer: Stable Emulation of Convective Parameterization using a Temporal Memory-aware Transformer

arXiv cs.LG · April 24, 2026


Key Points

  • The paper proposes “climt-paraformer,” a Transformer-based neural network emulator that aims to replicate moist convective sub-grid parameterizations in global climate models more accurately and efficiently.
  • It addresses a key limitation of prior neural emulators by explicitly modeling temporal dependencies (convective “memory”) rather than using only instantaneous, memory-less inputs.
  • Evaluations in a single-column climate model (both offline and online) show the Transformer captures temporal correlations and nonlinear interactions and achieves lower offline errors than baseline memory-less MLP and recurrent LSTM approaches.
  • Sensitivity testing finds an optimal temporal memory length of about 100 minutes, while longer memory can worsen performance.
  • In longer-term coupled climate simulations, the emulator remains stable over 10 years, highlighting practical robustness for climate applications.
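The core idea in the key points above, attending over a short window of past atmospheric states instead of using only the instantaneous state, can be sketched in a few lines. This is a minimal single-head self-attention toy in NumPy, not the paper's model: all weights are random, and the dimensions (a 100-minute window at 10-minute steps, 40 input features, 20 output tendencies) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_emulator(history, Wq, Wk, Wv, Wo):
    """Single-head self-attention over a window of past column states.

    history: (T, F) array -- T past time steps, F features per step
             (e.g. temperature/humidity profiles stacked across levels).
    Returns an (F_out,) tendency estimate read off the most recent step.
    """
    Q, K, V = history @ Wq, history @ Wk, history @ Wv
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))  # (T, T) temporal attention map
    context = weights @ V                    # each step mixes in all past steps
    return context[-1] @ Wo                  # tendencies from the latest step

# Illustrative shapes: 10 steps of memory, 40 features in, 20 tendencies out.
rng = np.random.default_rng(0)
T, F, D, F_out = 10, 40, 16, 20
history = rng.standard_normal((T, F))
Wq, Wk, Wv = (rng.standard_normal((F, D)) * 0.1 for _ in range(3))
Wo = rng.standard_normal((D, F_out)) * 0.1
tendencies = attention_emulator(history, Wq, Wk, Wv, Wo)
print(tendencies.shape)  # (20,)
```

The contrast with a memory-less MLP is visible in the `(T, T)` attention map: the model learns how strongly the current tendency should depend on each earlier state, which is exactly the "convective memory" a per-time-step MLP cannot represent.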

Abstract

Accurate representation of moist convective sub-grid-scale processes remains a major challenge in global climate models, as traditional parameterization schemes are both computationally expensive and difficult to scale. Neural network (NN) emulators offer a promising alternative by learning efficient mappings between atmospheric states and convective tendencies while retaining fidelity to the underlying physics. However, most existing NN-based parameterizations are memory-less and rely only on instantaneous inputs, even though convection evolves over time and depends on prior atmospheric states. Recent studies have begun to incorporate convective memory, but they often treat past states as independent features rather than modeling temporal dependencies explicitly. In this work, we develop a temporal memory-aware Transformer emulator for the Emanuel convective parameterization and evaluate it in a single-column climate model (SCM) under both offline and online configurations. The Transformer captures temporal correlations and nonlinear interactions across consecutive atmospheric states. Compared with baseline emulators, including a memory-less multilayer perceptron and a recurrent long short-term memory model, the Transformer achieves lower offline errors. Sensitivity analysis indicates that a memory length of approximately 100 minutes yields the best performance, whereas longer memory degrades performance. We further test the emulator in long-term coupled simulations and show that it remains stable over 10 years. Overall, this study demonstrates the importance of explicit temporal modeling for NN-based parameterizations.