OT on the Map: Quantifying Domain Shifts in Geographic Space

arXiv cs.LG / 4/20/2026


Key Points

  • The paper addresses out-of-domain generalization in geographic computer vision/ML by quantifying how far a deployment region’s data distribution is from the training regions.
  • It introduces GeoSpOT, a method that combines geographic information with Optimal Transport to compute distances between geospatial domains, so that practitioners can judge whether cross-region adaptation is likely to succeed.
  • In the authors' experiments, GeoSpOT distances are effective predictors of transfer difficulty when moving to a new geographic domain.
  • The authors find that embeddings from pretrained location encoders, which take only longitude-latitude pairs as input, provide information comparable to image/text embeddings. This makes it possible to approximate out-of-domain performance even when the downstream task is unknown or no task-specific data is available.
  • GeoSpOT distances can be used proactively to guide data selection and to predict which regions a geospatial model is likely to underperform on.

Abstract

In computer vision and machine learning for geographic data, out-of-domain generalization is a pervasive challenge, arising from uneven global data coverage and distribution shifts across geographic regions. Though models are frequently trained in one region and deployed in another, there is no principled method for determining when this cross-region adaptation will be successful. A well-defined notion of distance between distributions can effectively quantify how different a new target domain is compared to the domains used for model training, which in turn could support model training and deployment decisions. In this paper, we propose a strategy for computing distances between geospatial domains that leverages geographic information with Optimal Transport methods (GeoSpOT). In our experiments, GeoSpOT distances emerge as effective predictors of cross-domain transfer difficulty. We further demonstrate that embeddings from pretrained location encoders provide information comparable to image/text embeddings, despite relying solely on longitude-latitude pairs as input. This allows users to get an approximation of out-of-domain performance for geospatial models, even when the exact downstream task is unknown, or no task-specific data is available. Building on these findings, we show that GeoSpOT distances can preemptively guide data selection and enable predictive tools to analyze regions where a model is likely to underperform.
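
To make the idea of an Optimal Transport distance between geographic domains concrete, here is a minimal sketch in plain Python. It is not the authors' GeoSpOT implementation: the abstract does not specify the cost function or solver, so this example assumes a great-circle (haversine) cost on raw longitude-latitude pairs, uniform weights on the sample points, and an entropy-regularized Sinkhorn solver. The function names, the `reg` and `n_iters` parameters, and the city coordinates are all illustrative assumptions. In practice the raw coordinates could be swapped for location-encoder or image embeddings with a suitable cost.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1 = math.radians(p[0]), math.radians(p[1])
    lon2, lat2 = math.radians(q[0]), math.radians(q[1])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sinkhorn_distance(source, target, cost_fn=haversine_km, reg=0.1, n_iters=500):
    """Entropy-regularized OT cost between two uniformly weighted point sets.

    Illustrative only: `reg` is applied to the cost matrix after scaling it
    by its maximum entry, an assumption made here for numerical stability.
    """
    n, m = len(source), len(target)
    cost = [[cost_fn(s, t) for t in target] for s in source]
    scale = max(max(row) for row in cost) or 1.0
    kernel = [[math.exp(-c / (reg * scale)) for c in row] for row in cost]

    a = [1.0 / n] * n  # uniform mass on source points
    b = [1.0 / m] * m  # uniform mass on target points
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]

    # Transport cost under the resulting plan P[i][j] = u[i] * K[i][j] * v[j].
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


# Hypothetical sample locations as (lon, lat) pairs.
train_region = [(2.35, 48.86), (-0.13, 51.51), (4.35, 50.85)]   # Paris, London, Brussels
near_region = [(13.40, 52.52), (11.58, 48.14), (16.37, 48.21)]  # Berlin, Munich, Vienna
far_region = [(36.82, -1.29), (32.58, 0.35), (39.28, -6.79)]    # Nairobi, Kampala, Dar es Salaam

d_near = sinkhorn_distance(train_region, near_region)
d_far = sinkhorn_distance(train_region, far_region)
print(f"train -> near: {d_near:.0f} km, train -> far: {d_far:.0f} km")
```

Under these assumptions, a larger distance flags a target region as further from the training data. Whether that distance tracks actual transfer difficulty is the empirical question the paper studies, and the sketch above makes no claim about it.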