Deep Variational Inference Symbolic Regression

arXiv cs.LG / 5/5/2026


Key Points

  • The paper introduces Deep Variational Inference Symbolic Regression (DVISR), which extends Deep Symbolic Regression (DSR) with variational Bayesian methods to estimate uncertainty rather than only producing a single best equation.
  • DVISR replaces DSR’s reward signal with the evidence lower bound (ELBO) integrand, aligning training with probabilistic inference over symbolic models.
  • The method modifies the neural network to output probability distributions over constants in symbolic expressions, enabling posterior inference over both the expression structure (trees) and numerical parameters.
  • Experiments show DVISR can recover the true posterior in simplified scenarios, including cases with and without explicit constant tokens.
  • The authors evaluate how performance changes as the expression search space grows, arguing this is a step toward scalable Bayesian symbolic regression with uncertainty over full symbolic models.
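The reward substitution in the second bullet can be sketched concretely. In DSR-style training, each sampled expression receives a scalar reward; DVISR instead scores it with the ELBO integrand, log p(D|e) + log p(e) − log q(e), where q is the network's distribution over expressions. The helper below is a hypothetical illustration (the function name, Gaussian noise model, and the fixed `log_prior`/`log_q` values are assumptions, not the authors' code):

```python
import numpy as np

def elbo_integrand_reward(y, y_pred, log_prior, log_q, noise_std=0.1):
    """Score a sampled expression e with the ELBO integrand:
    log p(D | e) + log p(e) - log q(e).

    Sketch only: assumes i.i.d. Gaussian observation noise with known
    scale, and that log_prior / log_q are supplied by the caller.
    """
    n = len(y)
    # Gaussian log-likelihood of the data under the candidate expression
    log_lik = (-0.5 * np.sum((y - y_pred) ** 2) / noise_std**2
               - n * np.log(noise_std * np.sqrt(2.0 * np.pi)))
    return log_lik + log_prior - log_q

# Toy data from y = 2x; compare an exact and an inexact candidate.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x
r_good = elbo_integrand_reward(y, 2.0 * x, log_prior=-3.0, log_q=-2.5)
r_bad = elbo_integrand_reward(y, 1.5 * x, log_prior=-3.0, log_q=-2.5)
```

A better-fitting expression receives a higher reward, so policy-gradient updates that originally maximized DSR's fitness now push q toward the Bayesian posterior over expressions.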

Abstract

Symbolic regression discovers explicit, interpretable equations without assuming a functional form in advance. A Bayesian approach strengthens this through probability distributions over candidate expressions, thus quantifying uncertainty in the presence of noisy and limited data. Deep Symbolic Regression (DSR) uses a neural network to generate symbolic expressions, but it is designed to identify a single best-fitting expression rather than infer a posterior distribution over models. We introduce Deep Variational Inference Symbolic Regression (DVISR), a variational Bayesian extension of DSR. DVISR replaces the original reward with the integrand of the evidence lower bound. It also extends the network architecture to output distributions over constants within expressions, enabling posterior inference over both expression trees and their associated constants. We show that DVISR can recover the true posterior in simple settings, both with and without constant tokens, and we examine how its performance changes as the size of the expression space increases. These results position DVISR as a step toward scalable Bayesian symbolic regression with uncertainty over full symbolic models.
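The architectural change described above, outputting distributions over constants rather than point values, can be sketched as follows. When the network emits a constant token, it produces parameters of a distribution (here assumed Gaussian, parameterized by a mean and a log standard deviation); the sampled value's log-density is accumulated into log q(expression) so that constants participate in the variational posterior. This is an illustrative sketch under those assumptions, not the paper's implementation:

```python
import numpy as np

def sample_constant(mu, log_sigma, rng):
    """Sample a constant-token value from the network's Gaussian head.

    Returns the sampled constant and its log-density, which is added
    to log q(expression) alongside the token probabilities.
    (Hypothetical helper; the Gaussian parameterization is an assumption.)
    """
    sigma = np.exp(log_sigma)
    c = mu + sigma * rng.standard_normal()
    log_q_c = (-0.5 * ((c - mu) / sigma) ** 2
               - np.log(sigma * np.sqrt(2.0 * np.pi)))
    return c, log_q_c

rng = np.random.default_rng(0)
c, log_q_c = sample_constant(mu=1.0, log_sigma=0.0, rng=rng)
```

Because constants are sampled rather than fitted once, repeated draws from the trained network yield a posterior over both the expression tree and its numerical parameters.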