From Mice to Trains: Amortized Bayesian Inference on Graph Data

arXiv stat.ML / 5/5/2026


Key Points

  • The paper proposes adapting Amortized Bayesian Inference (ABI) to graph-structured data to enable fast, likelihood-free posterior inference across node-, edge-, and graph-level parameters.
  • It addresses key graph inference challenges by using permutation-invariant graph encoders coupled with neural posterior estimators that can scale across different graph sizes and sparsity levels.
  • The proposed two-module pipeline uses a summary network to convert attributed graphs into fixed-length representations, followed by an inference network that approximates the posterior over parameters.
  • The authors evaluate multiple candidate summary-network architectures on controlled synthetic data and two real-world domains—biology and logistics—focusing on recovery quality and calibration.
  • The work positions generative neural networks within a Bayesian simulation-based framework to capture complex long-range dependencies typical of graph data.
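The two-module pipeline above can be illustrated with a minimal sketch. This is not the paper's architecture: the single message-passing round, mean pooling, and diagonal-Gaussian posterior head are all simplifying assumptions chosen to make the permutation-invariance property easy to verify.

```python
import numpy as np

# Toy sketch of the two-module ABI pipeline: summary network -> fixed-length
# embedding -> inference network. Weights are random stand-ins for a trained model;
# shapes and layer choices are assumptions, not the paper's exact design.
rng = np.random.default_rng(0)
d_feat, d_emb, d_param = 4, 8, 2

W_msg = rng.normal(size=(d_feat, d_emb))
W_pool = rng.normal(size=(d_emb, d_emb))
W_mu = rng.normal(size=(d_emb, d_param))
W_logsig = rng.normal(size=(d_emb, d_param))

def summary_network(X, A):
    """Map an attributed graph (node features X, adjacency A) to a
    fixed-length embedding: one message-passing round, then mean pooling."""
    H = np.tanh(A @ X @ W_msg)               # aggregate neighbor features
    return np.tanh(H.mean(axis=0) @ W_pool)  # pooling over nodes => permutation-invariant

def inference_network(z):
    """Approximate the posterior over parameters as a diagonal Gaussian."""
    return z @ W_mu, np.exp(z @ W_logsig)    # posterior mean and std

# Toy graph: 5 nodes, random features, symmetric adjacency.
n = 5
X = rng.normal(size=(n, d_feat))
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T

z = summary_network(X, A)
mu, sigma = inference_network(z)

# Permutation invariance: relabeling the nodes leaves the embedding unchanged.
perm = rng.permutation(n)
z_perm = summary_network(X[perm], A[np.ix_(perm, perm)])
assert np.allclose(z, z_perm)
```

Mean pooling is what makes the graph embedding invariant to node relabeling: permuting the rows of `X` and conjugating `A` by the same permutation only reorders the per-node messages before they are averaged.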

Abstract

Graphs arise across diverse domains, from biology and chemistry to social and information networks, as well as transportation and logistics. Inference on graph-structured data requires methods that are permutation-invariant, scalable across varying sizes and sparsities, and capable of capturing complex long-range dependencies, which makes posterior estimation of graph parameters particularly challenging. Amortized Bayesian Inference (ABI) is a simulation-based framework that employs generative neural networks to enable fast, likelihood-free posterior inference. We adapt ABI to graph data to address these challenges and to perform inference on node-, edge-, and graph-level parameters. Our approach couples permutation-invariant graph encoders with flexible neural posterior estimators in a two-module pipeline: a summary network maps attributed graphs to fixed-length representations, and an inference network approximates the posterior over parameters. Several neural architectures can serve as the summary network; we evaluate multiple candidates on controlled synthetic settings and two real-world domains, biology and logistics, in terms of parameter recovery and calibration.
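The calibration assessment mentioned in the abstract can be made concrete with simulation-based calibration (SBC), a standard check for simulation-based inference. The sketch below uses a toy conjugate-Gaussian model where the exact posterior is known, standing in for an amortized inference network; the model, sample counts, and rank statistic are illustrative assumptions, not the paper's evaluation protocol.

```python
import numpy as np

# Simulation-based calibration (SBC) sketch: draw theta from the prior,
# simulate data, sample from the (approximate) posterior, and record the
# rank of the true theta among the posterior draws. For a calibrated
# posterior, these ranks are uniform on {0, ..., m}.
rng = np.random.default_rng(1)

def prior():
    return rng.normal()                     # theta ~ N(0, 1)

def simulate(theta, n=20):
    return rng.normal(theta, 1.0, size=n)   # y_i | theta ~ N(theta, 1)

def posterior_samples(y, m=99):
    # Exact conjugate posterior here; an amortized inference network
    # would replace this step in the ABI setting.
    var = 1.0 / (1.0 + y.size)
    mean = var * y.sum()
    return rng.normal(mean, np.sqrt(var), size=m)

ranks = []
for _ in range(500):
    theta = prior()
    y = simulate(theta)
    draws = posterior_samples(y)
    ranks.append(int((draws < theta).sum()))  # rank of the truth among draws

ranks = np.array(ranks)
print(ranks.mean())  # close to 49.5 for a calibrated posterior
```

In practice the rank histogram is inspected for uniformity; systematic skew or U-shapes indicate an over- or under-confident posterior approximation.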