Can Graph Foundation Models Generalize Over Architecture?

arXiv cs.LG / 3/25/2026


Key Points

  • The paper examines why current graph foundation models (GFMs) often fail to truly generalize across tasks, pointing to a hidden reliance on fixed GNN architectural backbones.
  • It argues that architecture adaptivity is necessary for “true” GFMs and shows, through theory and controlled experiments, that fixed-backbone approaches underperform when task-specific architectural requirements differ from training-time conditions.
  • As an explicit case study, it uses the concept of “range” (a minimal measurable architectural axis) to demonstrate non-robustness of existing domain-agnostic GFMs to architectural variation.
  • To overcome this, the authors propose an inference-time framework that discovers and mixes task-specific linear graph operators, improving zero-shot generalization without retraining.
  • Experiments on synthetic arbitrary-range tasks and multiple real-world benchmarks show better performance and robustness compared with existing domain-agnostic GFMs.

Abstract

Graph foundation models (GFMs) have recently attracted interest due to the promise of graph neural network (GNN) architectures that generalize zero-shot across graphs of arbitrary scales, feature dimensions, and domains. While existing work has demonstrated this ability empirically across diverse real-world benchmarks, these tasks share a crucial hidden limitation: they admit a narrow set of effective GNN architectures. In particular, current domain-agnostic GFMs rely on fixed architectural backbones, implicitly assuming that a single message-passing regime suffices across tasks. In this paper, we argue that architecture adaptivity is a necessary requirement for true GFMs. We show that existing approaches are non-robust to task-dependent architectural attributes and, as a case study, use range as a minimal and measurable axis along which this limitation becomes explicit. With theoretical analysis and controlled synthetic experiments, we demonstrate that fixed-backbone GFMs provably under-reach on tasks whose architectural requirements differ from those seen at training time. To address this issue, we introduce a framework that adapts the effective GNN architecture at inference time by discovering and mixing task-specific linear graph operators, enabling zero-shot generalization across tasks with heterogeneous architectural requirements, without retraining. We validate our approach on arbitrary-range synthetic tasks and a suite of real-world benchmarks, demonstrating improved performance and robustness over existing domain-agnostic GFMs.
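
The operator-mixing idea in the abstract can be sketched roughly as follows (a hypothetical toy version; the operator bank, the softmax mixing, and all function names here are illustrative assumptions, not the authors' implementation): build a bank of linear graph operators of increasing range, then combine them with task-specific weights at inference time so the effective receptive range adapts without retraining.

```python
import numpy as np

def normalized_adjacency(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def operator_bank(A, max_power=3):
    # Operators of increasing range: I, S, S^2, ..., S^max_power
    S = normalized_adjacency(A)
    ops = [np.eye(A.shape[0])]
    for _ in range(max_power):
        ops.append(ops[-1] @ S)
    return ops

def mix_operators(ops, raw_weights):
    # Convex combination of bank operators (softmax over raw weights);
    # in the paper's setting these weights would be discovered per task.
    w = np.exp(raw_weights - raw_weights.max())
    w /= w.sum()
    return sum(wi * Op for wi, Op in zip(w, ops))

# 4-node path graph with a unit signal on node 0.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
ops = operator_bank(A, max_power=3)
X = np.array([[1.0], [0.0], [0.0], [0.0]])

# Weights emphasizing S^3 yield a long-range operator; weights
# emphasizing the identity yield a short-range one.
long_range = mix_operators(ops, np.array([0.0, 0.0, 1.0, 3.0]))
short_range = mix_operators(ops, np.array([3.0, 1.0, 0.0, 0.0]))
print((long_range @ X).ravel())   # more mass reaches distant node 3
print((short_range @ X).ravel())  # mass stays concentrated at node 0
```

Because each operator is linear, the mixture is again a single linear operator, so adapting the mixture weights changes the effective architecture (here, its range) without touching any pretrained parameters.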