Bayesian Cosmic Void Finding with Graph Flows

arXiv stat.ML / 4/20/2026

Key Points

  • The paper addresses the challenge of reliably finding cosmic voids from sparse galaxy surveys, noting that void identification is underconstrained and should be treated probabilistically rather than deterministically.
  • It proposes a probabilistic void finder that samples from the stochastic mapping between an observed galaxy catalog and void catalogs under an arbitrary, user-chosen void definition.
  • The method uses a deep graph neural network that evolves “test particles” via a flow-matching objective to generate void catalogs as samples from the desired distribution.
  • In a simplified example setting, with training data from a deterministic teacher void finder, the model performs well but remains considerably stochastic, which the authors interpret as regularization; the predicted void catalogs carry more cosmological information than the teacher's.
  • Beyond emulating existing void finders cheaply, the approach aims to learn the Bayes-optimal mapping for arbitrary void definitions, including voids defined in terms of simulated matter density and velocity fields, and outlines steps toward practical deployment.
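
The flow-matching idea behind the "test particles" can be illustrated with a minimal NumPy sketch. It assumes the common straight-line interpolation path between a particle's start position x0 and its target position x1, which the abstract does not confirm as the paper's choice. The function names and the `velocity_fn` placeholder are hypothetical; in the paper the velocity field is a deep graph neural network conditioned on the galaxy catalog, which is not reproduced here.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Position at time t on the straight line from x0 (t=0) to x1 (t=1).

    x0, x1: arrays of shape (N, D); t: array of shape (N,).
    """
    t = np.asarray(t, dtype=float)[..., None]  # shape (N, 1) for broadcasting
    return (1.0 - t) * x0 + t * x1

def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocities.

    For the straight-line path (an assumption), the target velocity
    is the constant displacement x1 - x0.
    """
    x_t = interpolate(x0, x1, t)
    target = x1 - x0
    pred = velocity_fn(x_t, t)
    return float(np.mean(np.sum((pred - target) ** 2, axis=-1)))

def euler_sample(velocity_fn, x0, n_steps=10):
    """Move particles from t=0 to t=1 with fixed-step Euler integration."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full(x.shape[0], k * dt)
        x = x + dt * velocity_fn(x, t)
    return x

# Toy check with made-up data: an "oracle" field that already knows the
# displacement has zero loss and carries every particle to its target.
rng = np.random.default_rng(0)
start = rng.uniform(size=(5, 3))    # hypothetical initial particle positions
target = rng.uniform(size=(5, 3))   # hypothetical target positions
oracle = lambda x, t: target - start
loss = flow_matching_loss(oracle, start, target, rng.uniform(size=5))
moved = euler_sample(oracle, start)
```

In a trained model the oracle would be replaced by the learned network, and drawing different random start positions would yield different void catalogs for the same galaxies, which is where the stochasticity of the output enters.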

Abstract

Cosmic voids contain higher-order cosmological information and are of interest for astroparticle physics. Finding genuine matter underdensities in sparse galaxy surveys is, however, an underconstrained problem. Traditional void finding algorithms produce deterministic void catalogs, neglecting the probabilistic nature of the problem. We present a method to sample from the stochastic mapping from galaxy catalogs to arbitrary void definitions. Our algorithm uses a deep graph neural network to evolve "test particles" according to a flow-matching objective. We demonstrate the method in a simplified example setting but outline steps to generalize it towards practically usable void finders. Trained on a deterministic teacher, the model performs well but has considerable stochasticity which we interpret as regularization. Cosmological information in the predicted void catalogs outperforms the teacher. On the one hand, our method can cheaply emulate existing void finders with apparently useful regularization. More importantly, it also allows us to find the Bayes-optimal mapping between observed galaxies and any void definition. This includes definitions operating at the level of simulated matter density and velocity fields.
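
One way to read the "stochastic mapping" in the abstract is as a conditional distribution over void catalogs given the observed galaxies. The notation below is an assumption introduced for illustration, not taken from the paper: G is the galaxy catalog, V a void catalog, (δ, u) the underlying matter density and velocity fields, and f the chosen void definition applied to those fields.

```latex
% Assumed notation: G = galaxy catalog, V = void catalog,
% (\delta, \mathbf{u}) = matter density and velocity fields,
% f = void definition, \delta_{\mathrm{D}} = Dirac delta.
p(V \mid G) \;=\; \int \delta_{\mathrm{D}}\!\bigl(V - f(\delta, \mathbf{u})\bigr)\,
                  p(\delta, \mathbf{u} \mid G)\;
                  \mathrm{d}\delta\, \mathrm{d}\mathbf{u}
```

Under this reading, a deterministic void finder returns a single V for each G, whereas a sampler trained on simulated pairs (G, V) can draw many catalogs that reflect how weakly sparse galaxies constrain the underlying fields.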