Score-based generative emulation of impact-relevant Earth system model outputs

arXiv stat.ML / 4/14/2026


Key Points

  • The paper proposes score-based diffusion generative emulators that replicate the joint distribution of impact-relevant climate variables (near-surface temperature, precipitation, relative humidity, and wind speed) produced by Earth System Models (ESMs).
  • The emulator is designed to produce outputs that can feed downstream impact models, so that evolving policy scenarios can be explored faster than Coupled Model Intercomparison Project (CMIP) cycles allow.
  • The method operates on a spherical mesh and can run on a single mid-range GPU, and the study introduces diagnostics comparing emulator outputs to parent ESMs via probability densities, cross-variable correlations, time of emergence, and tail behavior.
  • Evaluations across three different ESMs and both pre-industrial and forced regimes show close distributional matching and correct capture of key forced responses, while also identifying failure cases tied to strong seasonal regime shifts.
  • The authors conclude that the emulator's inaccuracies are small relative to the internal variability of ESM projections, and outline future work on daily resolution, finer spatial scales, and bias-aware training, with code released on GitHub.
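
Two of the diagnostics listed above, cross-variable correlations and tail behavior, can be illustrated with a minimal sketch. This is not the paper's diagnostic suite: the function names `correlation_gap` and `tail_quantile_gap`, the 0.99 quantile, and the synthetic two-variable Gaussian data standing in for ESM and emulator samples are all assumptions made for illustration.

```python
import numpy as np

def correlation_gap(esm, emu):
    """Largest absolute difference between the two cross-variable
    correlation matrices. Inputs have shape (n_samples, n_variables)."""
    corr_esm = np.corrcoef(esm, rowvar=False)
    corr_emu = np.corrcoef(emu, rowvar=False)
    return float(np.max(np.abs(corr_esm - corr_emu)))

def tail_quantile_gap(esm, emu, q=0.99):
    """Per-variable difference in the q-th quantile (emulator minus ESM)."""
    return np.quantile(emu, q, axis=0) - np.quantile(esm, q, axis=0)

# Synthetic stand-ins (assumption): two correlated variables drawn from the
# same distribution, mimicking a well-calibrated emulator.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
esm = rng.multivariate_normal([0.0, 0.0], cov, size=5000)
emu = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

corr_gap = correlation_gap(esm, emu)
tail_gap = tail_quantile_gap(esm, emu)
print("max correlation gap:", corr_gap)
print("99th-percentile gap per variable:", tail_gap)
```

With real data, each row would be one grid cell and month, and the gaps would be judged against the spread expected from internal variability rather than against zero.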

Abstract

Policy targets evolve faster than the Coupled Model Intercomparison Project cycles, complicating adaptation and mitigation planning that must often contend with outdated projections. Climate model output emulators address this gap by offering inexpensive surrogates that can rapidly explore alternative futures while staying close to Earth System Model (ESM) behavior. The focus is on emulators designed to provide inputs to impact models. Using monthly ESM fields of near-surface temperature, precipitation, relative humidity, and wind speed, it is shown that deep generative models have the potential to model the joint distribution of variables relevant for impacts. The specific model proposed uses score-based diffusion on a spherical mesh and runs on a single mid-range graphics processing unit. A thorough suite of diagnostics is introduced to compare emulator outputs with their parent ESMs, including their probability densities, cross-variable correlations, time of emergence, and tail behavior. The emulator performance is evaluated across three distinct ESMs in both pre-industrial and forced regimes. The results show that the emulator produces distributions that closely match the ESM outputs and captures key forced responses. They also reveal important failure cases, notably for variables with a strong regime shift in the seasonal cycle. Although not a perfect match to the ESM, the inaccuracies of the emulator are small relative to the magnitude of internal variability in ESM projections. This suggests that the generative emulators can be useful in supporting impact assessment. Priorities for future development toward daily resolution, finer spatial scales, and bias-aware training are discussed. Code is made available at https://github.com/shahineb/climemu.
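
To give a feel for how score-based diffusion generates samples, here is a toy sketch in one dimension. It is not the paper's model, which works on a spherical mesh with a learned score network. Here the target is assumed to be a Gaussian so that the score is known in closed form. The variance-preserving SDE, the constant noise rate `beta`, and the function names are likewise illustrative assumptions.

```python
import numpy as np

def analytic_score(x, t, mu, sigma, beta):
    """Score (gradient of log density) of the noised marginal at time t for a
    Gaussian target N(mu, sigma^2) under the forward SDE
    dx = -0.5 * beta * x dt + sqrt(beta) dW.
    In practice a neural network would approximate this function."""
    decay = np.exp(-beta * t)
    mean_t = mu * np.sqrt(decay)
    var_t = sigma**2 * decay + (1.0 - decay)
    return -(x - mean_t) / var_t

def sample_reverse_sde(n_samples, mu=2.0, sigma=0.5, beta=10.0,
                       n_steps=500, seed=0):
    """Euler-Maruyama integration of the reverse-time SDE from t=1 to t=0."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    # At t=1 the noised marginal is approximately a standard normal.
    x = rng.standard_normal(n_samples)
    for k in range(n_steps, 0, -1):
        t = k * dt
        score = analytic_score(x, t, mu, sigma, beta)
        # Reverse-time drift: forward drift minus g^2 * score, with g^2 = beta.
        drift = -0.5 * beta * x - beta * score
        noise = rng.standard_normal(n_samples)
        # Stepping backward in time flips the sign of the drift term.
        x = x - drift * dt + np.sqrt(beta * dt) * noise
    return x

samples = sample_reverse_sde(20_000)
print("sample mean:", samples.mean(), "sample std:", samples.std())
```

Starting from pure noise, the reverse-time integration should recover samples whose mean and standard deviation are close to the assumed target values of 2.0 and 0.5. Replacing `analytic_score` with a trained network is what turns this toy into a generative model.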