Global Attention with Linear Complexity for Exascale Generative Data Assimilation in Earth System Prediction

arXiv cs.LG / 4/21/2026



Key Points

The paper presents a one-stage generative data assimilation (DA) framework that reformulates DA as Bayesian posterior sampling, replacing the conventional forecast-update cycle.
It introduces STORM, a spatiotemporal transformer whose global attention uses a linear-complexity scaling algorithm, removing the quadratic attention bottleneck.
On 32,768 GPUs of the Frontier supercomputer, the authors report 63% strong scaling efficiency and 1.6 ExaFLOP sustained performance.
The method is scaled up to 20 billion spatiotemporal tokens, enabling km-scale global modeling across 177k temporal frames, which the authors say was previously out of reach.
The work targets a key bottleneck in exascale Earth system prediction, namely scalable and accurate inference of the Earth system state, with the aim of improving uncertainty quantification and the prediction of extreme events.
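
To make the "linear complexity" point concrete, here is a minimal sketch of one well-known way to get global attention at linear cost: kernelized linear attention. It is not STORM's algorithm, which this summary does not describe. The feature map `phi` (elu(x) + 1) and the function names are assumptions chosen for illustration. The idea is that if attention weights factor as phi(q)·phi(k), the sum over keys can be computed once and reused for every query, so the work grows with N rather than N².

```python
import math

def phi(x):
    # Positive feature map elu(x) + 1 (an assumed choice, not taken from the paper).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def quadratic_attention(Q, K, V):
    # Reference version: builds every query-key weight explicitly, O(N^2) work.
    dv = len(V[0])
    out = []
    for q in Q:
        fq = phi(q)
        w = [dot(fq, phi(k)) for k in K]
        z = sum(w)
        out.append([sum(wj * v[c] for wj, v in zip(w, V)) / z for c in range(dv)])
    return out

def linear_attention(Q, K, V):
    # Same result, but the key/value summary is accumulated once, O(N) work.
    d, dv = len(K[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]  # S = sum_j phi(k_j) v_j^T
    z = [0.0] * d                       # z = sum_j phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d):
            z[a] += fk[a]
            for c in range(dv):
                S[a][c] += fk[a] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        norm = dot(fq, z)
        out.append([sum(fq[a] * S[a][c] for a in range(d)) / norm for c in range(dv)])
    return out

if __name__ == "__main__":
    # Tiny hypothetical inputs: 3 queries, 4 keys/values.
    Q = [[0.1, -0.2], [0.5, 0.3], [-1.0, 0.7]]
    K = [[0.4, 0.0], [-0.3, 0.9], [1.2, -0.5], [0.0, 0.2]]
    V = [[1.0, 2.0], [0.0, -1.0], [2.0, 2.0], [-1.0, 0.0]]
    ref, fast = quadratic_attention(Q, K, V), linear_attention(Q, K, V)
    print(max(abs(x - y) for r, f in zip(ref, fast) for x, y in zip(r, f)))
```

Every query still attends to every key, so the attention remains global; only the order of the summations changes. Whether STORM uses a kernel factorization of this kind or a different mechanism is not stated here.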

Abstract

Accurate weather and climate prediction relies on data assimilation (DA), which estimates the Earth system state by integrating observations with models. While exascale computing has significantly advanced earth simulation, scalable and accurate inference of the Earth system state remains a fundamental bottleneck, limiting uncertainty quantification and prediction of extreme events. We introduce a unified one-stage generative DA framework that reformulates assimilation as Bayesian posterior sampling, replacing the conventional forecast-update cycle with compute-dense, GPU-efficient inference. At the core is STORM, a novel spatiotemporal transformer with a global attention linear-complexity scaling algorithm that breaks the quadratic attention barrier. On 32,768 GPUs of the Frontier supercomputer, our method achieves 63% strong scaling efficiency and 1.6 ExaFLOP sustained performance. We further scale to 20 billion spatiotemporal tokens, enabling km-scale global modeling over 177k temporal frames, regimes previously unreachable, establishing a new paradigm for Earth system prediction.
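
The scaling figures in the abstract can be unpacked with a little arithmetic. The sketch below uses the standard definition of strong scaling efficiency (achieved speedup divided by ideal speedup for a fixed problem size) and reads "1.6 ExaFLOP sustained" as 1.6e18 floating-point operations per second, which is an assumption about units. The baseline GPU count and timings are hypothetical, since the abstract does not state them.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Achieved speedup over ideal speedup when the problem size is fixed."""
    return (t_ref * n_ref) / (t_n * n)

# Figures reported in the abstract.
SUSTAINED_FLOPS = 1.6e18  # assumed to mean FLOP per second
NUM_GPUS = 32_768

# Average sustained throughput per GPU, in TFLOP/s.
per_gpu_tflops = SUSTAINED_FLOPS / NUM_GPUS / 1e12

# Hypothetical baseline (not from the paper): 1,024 GPUs taking 100 s per step.
t_ref, n_ref = 100.0, 1_024
t_ideal = t_ref * n_ref / NUM_GPUS  # perfect scaling to 32,768 GPUs
t_large = t_ideal / 0.63            # step time implied by 63% efficiency

if __name__ == "__main__":
    print(f"per-GPU sustained throughput: {per_gpu_tflops:.1f} TFLOP/s")
    print(f"ideal step time: {t_ideal:.3f} s, at 63% efficiency: {t_large:.3f} s")
    print(f"efficiency check: {strong_scaling_efficiency(t_ref, n_ref, t_large, NUM_GPUS):.2f}")
```

Under these assumptions, 63% efficiency means the large run takes about 1.6 times longer per step than perfect scaling from the baseline would predict. The per-GPU figure is an average over the whole machine allocation and says nothing about the numerical precision used.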
