TerraFlow: Multimodal, Multitemporal Representation Learning for Earth Observation

arXiv cs.CV / 3/16/2026

Key Points

  • TerraFlow is a novel multimodal, multitemporal learning approach for Earth observation that integrates data from different modalities and time steps.
  • It introduces temporal training objectives that enable sequence-aware learning across space, time, and modality while remaining robust to variable-length inputs common in real-world EO data.
  • In experiments, TerraFlow outperforms state-of-the-art foundation models on all temporal tasks in the GEO-Bench-2 benchmark.
  • The work demonstrates initial steps toward deep-learning-based risk map prediction for natural disasters, a task on which other state-of-the-art foundation models frequently collapse.
  • Compared with state-of-the-art foundation models, TerraFlow achieves up to 50% higher F1 score and 24% lower Brier score (for the Brier score, lower is better).
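
For readers unfamiliar with the two metrics cited above, the following is a minimal sketch of their standard binary definitions. It is not the paper's evaluation code; the function names, the 0.5 decision threshold, and the toy data are assumptions chosen for illustration.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0.0 when there are no true or predicted positives.
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and binary outcomes (lower is better)."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Toy data (hypothetical): 1 = event occurred in a map cell, 0 = no event.
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6]

# Assumed threshold of 0.5 to turn probabilities into hard labels for F1.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print(round(f1_score(y_true, y_pred), 3))    # 0.667
print(round(brier_score(y_true, y_prob), 3))  # 0.162
```

F1 is computed on thresholded labels, while the Brier score uses the predicted probabilities directly, so it also rewards well-calibrated predictions. A model that predicts the negative class everywhere receives an F1 of 0 under this definition, whatever its accuracy.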

Abstract

We propose TerraFlow, a novel approach to multimodal, multitemporal learning for Earth observation. TerraFlow builds on temporal training objectives that enable sequence-aware learning across space, time, and modality, while remaining robust to the variable-length inputs commonly encountered in real-world Earth observation data. Our experiments demonstrate the superiority of TerraFlow over state-of-the-art foundation models for Earth observation across all temporal tasks of the GEO-Bench-2 benchmark. We additionally demonstrate that TerraFlow is able to make initial steps towards deep-learning-based risk map prediction for natural disasters, a task on which other state-of-the-art foundation models frequently collapse. TerraFlow outperforms state-of-the-art foundation models by up to 50% in F1 score and 24% in Brier score.
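
The abstract does not say how TerraFlow handles variable-length inputs. As a generic illustration of the problem only, the sketch below shows one common way to batch time series of different lengths: pad them to a shared length and carry a validity mask so that padded steps do not affect downstream statistics. The helper names `pad_sequences` and `masked_mean` and the toy values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def pad_sequences(
    seqs: Sequence[Sequence[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Pad each sequence to the longest length and return (padded, mask).

    mask[i][t] is True where padded[i][t] is a real observation.
    """
    max_len = max((len(s) for s in seqs), default=0)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    mask = [[True] * len(s) + [False] * (max_len - len(s)) for s in seqs]
    return padded, mask


def masked_mean(padded: List[List[float]], mask: List[List[bool]]) -> List[float]:
    """Average each row over valid time steps only, ignoring padding."""
    means = []
    for row, row_mask in zip(padded, mask):
        valid = [v for v, keep in zip(row, row_mask) if keep]
        # An empty sequence has no valid steps; fall back to 0.0 by convention.
        means.append(sum(valid) / len(valid) if valid else 0.0)
    return means


# Hypothetical per-location series with 3, 1, and 2 acquisitions
# (for example, after cloudy scenes were dropped).
series = [[0.2, 0.4, 0.6], [0.5], [0.1, 0.3]]
padded, mask = pad_sequences(series)
print(mask)
print(masked_mean(padded, mask))
```

Without the mask, the padded zeros would pull the averages of the shorter series toward zero. The same principle applies to attention masks in sequence models, although whether TerraFlow uses such a mechanism is not stated in this summary.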