BITS for GAPS: Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates

arXiv stat.ML / 3/24/2026


Key Points

  • The paper proposes BITS for GAPS, a framework for information-theoretic experimental design of hierarchical Gaussian-process surrogate models that explicitly accounts for hyperparameter uncertainty.
  • Rather than using fixed or point-estimated hyperparameters in acquisition functions, the method uses Bayesian hierarchical modeling to propagate uncertainty from both the latent GP function and its hyperparameters into the sampling criterion.
  • The authors derive two theoretical results, a closed-form approximation of the posterior differential entropy and a lower bound on it, which characterize the information gain used for adaptive sampling.
  • In a vapor-liquid equilibrium hybrid modeling case study, the approach improves expected information gain and predictive accuracy by preferentially sampling regions with high uncertainty in the Wilson activity model.
  • The results demonstrate how partial physical knowledge can be encoded through hierarchical priors in a GP surrogate and then used to inform downstream distillation design.
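The idea of carrying hyperparameter uncertainty into the sampling criterion can be illustrated with a minimal sketch. It is not the paper's acquisition function or its derived bound. It assumes a zero-mean GP with a squared-exponential kernel, hypothetical log-normal priors on the lengthscale and signal variance, and self-normalized importance weights from the marginal likelihood. The predictive distribution at each candidate is then a Gaussian mixture over hyperparameter samples, whose entropy is bracketed by two generic quantities: the weighted average of the component entropies (lower) and the entropy of a moment-matched Gaussian (upper). All function and variable names are illustrative.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale, signal_var):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_predict(x_train, y_train, x_cand, lengthscale, signal_var, noise_var):
    """Zero-mean GP: predictive mean and variance of a noisy observation at
    x_cand, plus the log marginal likelihood of the training data."""
    n = len(x_train)
    K = rbf_kernel(x_train, x_train, lengthscale, signal_var) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = rbf_kernel(x_train, x_cand, lengthscale, signal_var)
    mean = K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    var = signal_var - np.sum(v ** 2, axis=0) + noise_var
    log_ml = (-0.5 * y_train @ alpha
              - np.sum(np.log(np.diag(L)))
              - 0.5 * n * np.log(2.0 * np.pi))
    return mean, var, log_ml

def entropy_bounds(x_train, y_train, x_cand, thetas, noise_var=1e-2):
    """Bracket the entropy of the hyperparameter-marginalized predictive.

    thetas: array of shape (S, 2) holding (lengthscale, signal_var) samples
    drawn from the prior; they are reweighted by the marginal likelihood.
    """
    means, variances, log_w = [], [], []
    for lengthscale, signal_var in thetas:
        m, v, lml = gp_predict(x_train, y_train, x_cand,
                               lengthscale, signal_var, noise_var)
        means.append(m)
        variances.append(v)
        log_w.append(lml)
    means, variances = np.array(means), np.array(variances)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # Lower bound: entropy is concave, so H(mixture) >= sum_s w_s H(component_s).
    lower = w @ (0.5 * np.log(2.0 * np.pi * np.e * variances))
    # Upper bound: a Gaussian has maximal entropy for a given variance.
    mix_mean = w @ means
    mix_var = w @ (variances + means ** 2) - mix_mean ** 2
    upper = 0.5 * np.log(2.0 * np.pi * np.e * mix_var)
    return lower, upper

# Toy usage with made-up data and hypothetical priors.
rng = np.random.default_rng(0)
x_train = np.array([0.1, 0.4, 0.9])
y_train = np.sin(2.0 * np.pi * x_train)
x_cand = np.linspace(0.0, 1.0, 50)
thetas = np.column_stack([
    rng.lognormal(mean=-1.0, sigma=0.5, size=64),  # lengthscale samples
    rng.lognormal(mean=0.0, sigma=0.5, size=64),   # signal-variance samples
])
lower, upper = entropy_bounds(x_train, y_train, x_cand, thetas)
x_next = x_cand[np.argmax(upper)]  # sample where the marginal predictive is most uncertain
```

With a single fixed hyperparameter setting the two quantities coincide, and the criterion reduces to the usual maximum-variance rule. The gap between them grows where the hyperparameter samples disagree about the predictive mean, which is the extra signal a point-estimate acquisition function would miss.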

Abstract

We introduce Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates (BITS for GAPS), a framework enabling information-theoretic experimental design of Gaussian process-based surrogate models. Unlike standard methods, which use fixed or point-estimated hyperparameters in acquisition functions, our approach propagates hyperparameter uncertainty into the sampling criterion through Bayesian hierarchical modeling. In this framework, a latent function receives a Gaussian process prior, while hyperparameters are assigned additional priors to capture the modeler's knowledge of the governing physical phenomena. Consequently, the acquisition function incorporates uncertainties from both the latent function and its hyperparameters, ensuring that sampling is guided by both data scarcity and model uncertainty. We further establish theoretical results in this context: a closed-form approximation and a lower bound of the posterior differential entropy. We demonstrate the framework's utility for hybrid modeling with a vapor-liquid equilibrium case study. Specifically, we build a surrogate model for latent activity coefficients in a binary mixture. We construct a hybrid model by embedding the surrogate into an extended form of Raoult's law. This hybrid model then informs distillation design. This case study shows how partial physical knowledge can be translated into a hierarchical Gaussian process surrogate. It also shows that using BITS for GAPS increases expected information gain and predictive accuracy by targeting high-uncertainty regions of the Wilson activity model. Overall, BITS for GAPS is a generalized uncertainty-aware framework for adaptive data acquisition in complex physical systems.
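To make the vapor-liquid equilibrium setting concrete, the following sketch computes a bubble-point pressure and vapor composition for a binary mixture from activity coefficients. It assumes the common modified form of Raoult's law, y_i P = x_i γ_i P_i^sat, which may differ from the extended form used in the paper, and it uses the standard binary Wilson equation only as a stand-in source of activity coefficients where a surrogate would otherwise be embedded. The Wilson parameters and saturation pressures are hypothetical values chosen for illustration.

```python
import numpy as np

def wilson_gammas(x1, lam12, lam21):
    """Activity coefficients of a binary mixture from the Wilson equation."""
    x1 = np.asarray(x1, dtype=float)
    x2 = 1.0 - x1
    s1 = x1 + lam12 * x2
    s2 = x2 + lam21 * x1
    bracket = lam12 / s1 - lam21 / s2
    ln_g1 = -np.log(s1) + x2 * bracket
    ln_g2 = -np.log(s2) - x1 * bracket
    return np.exp(ln_g1), np.exp(ln_g2)

def bubble_point(x1, gamma1, gamma2, p1_sat, p2_sat):
    """Modified Raoult's law: total pressure and vapor mole fraction of
    component 1 at a fixed temperature (saturation pressures given)."""
    x1 = np.asarray(x1, dtype=float)
    partial1 = x1 * gamma1 * p1_sat
    partial2 = (1.0 - x1) * gamma2 * p2_sat
    pressure = partial1 + partial2
    return pressure, partial1 / pressure

# Hypothetical parameters; pressures in kPa.
x1 = np.linspace(0.0, 1.0, 11)
g1, g2 = wilson_gammas(x1, lam12=0.5, lam21=0.8)
P, y1 = bubble_point(x1, g1, g2, p1_sat=100.0, p2_sat=40.0)
```

Setting both Wilson parameters to 1 recovers an ideal solution with unit activity coefficients, which is a quick sanity check. In a hybrid model of the kind the abstract describes, the call to `wilson_gammas` is the piece a learned surrogate would replace, while the Raoult's law relation stays fixed.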