Uncertainty-Aware Predictive Safety Filters for Probabilistic Neural Network Dynamics

arXiv cs.LG / 4/30/2026


Key Points

  • The paper introduces the Uncertainty-Aware Predictive Safety Filter (UPSi), which extends predictive safety filters to probabilistic ensemble (PE) neural network dynamics models, replacing the first-principles models or Gaussian processes that limit the scalability and applicability of earlier filters.
  • UPSi formulates future outcomes as reachable sets to make its safety predictions rigorous, and adds an explicit certainty constraint that prevents “model exploitation”.
  • The method is designed to integrate directly into standard model-based reinforcement learning (MBRL) workflows, particularly Dyna-style MBRL setups.
  • Experiments on common safe RL benchmarks show substantial improvements in exploration safety compared with prior neural-network-based predictive safety filters, while keeping performance roughly on par with standard MBRL.
  • Overall, UPSi is positioned as a bridge between the scalability/general applicability of modern probabilistic MBRL and the formal safety guarantees of predictive safety filters.
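
The key points above can be made concrete with a toy sketch. This is not the paper's algorithm: it is a minimal illustration, under stated assumptions, of how a filter might combine reachable-set propagation through an ensemble with a certainty constraint. Everything here is hypothetical: the 1-D linear ensemble members, the interval representation of the reachable set, the `beta`-sigma noise bound (a heuristic, not a guarantee), the use of ensemble disagreement as the certainty measure, the backup controller, and all function names and thresholds.

```python
import numpy as np


def make_ensemble(n_members=5):
    """Hypothetical stand-in for a trained probabilistic ensemble.

    Each member is a toy 1-D model returning (mean, std) of the next state.
    Members differ in their gain, so their disagreement grows with |x|.
    """
    gains = np.linspace(0.97, 1.03, n_members)

    def member(gain):
        def predict(x, u):
            mean = gain * x + 0.1 * u
            std = 0.01 + 0.02 * abs(x)  # toy aleatoric noise
            return mean, std
        return predict

    return [member(g) for g in gains]


def step_interval(ensemble, lo, hi, u, beta=2.0):
    """Propagate the state interval [lo, hi] one step under action u.

    Only the two endpoints are evaluated, which is valid here because the
    toy members are monotone in x; a real network would need a proper
    set-propagation method. Returns the new interval and the largest
    spread between member means (the toy certainty measure).
    """
    new_lo, new_hi, disagreement = np.inf, -np.inf, 0.0
    for x in (lo, hi):
        preds = [predict(x, u) for predict in ensemble]
        means = np.array([m for m, _ in preds])
        stds = np.array([s for _, s in preds])
        new_lo = min(new_lo, float(np.min(means - beta * stds)))
        new_hi = max(new_hi, float(np.max(means + beta * stds)))
        disagreement = max(disagreement, float(means.max() - means.min()))
    return new_lo, new_hi, disagreement


def certify(ensemble, x0, u_first, horizon=5, x_max=1.0, max_disagreement=0.05):
    """Check a plan: u_first, then a hypothetical backup controller."""
    lo = hi = float(x0)
    u = float(u_first)
    for _ in range(horizon):
        lo, hi, disagreement = step_interval(ensemble, lo, hi, u)
        # State constraint: the whole interval must stay inside [-x_max, x_max].
        if lo < -x_max or hi > x_max:
            return False
        # Certainty constraint: reject plans the ensemble disagrees about.
        if disagreement > max_disagreement:
            return False
        # Hypothetical backup: steer the interval midpoint toward the origin.
        u = float(np.clip(-5.0 * 0.5 * (lo + hi), -1.0, 1.0))
    return True


def safety_filter(ensemble, x0, u_proposed, candidates=None):
    """Return the certified candidate action closest to u_proposed.

    Returns None when no candidate can be certified; the caller then needs
    its own fallback.
    """
    if candidates is None:
        candidates = np.linspace(-1.0, 1.0, 41)
    feasible = [float(u) for u in candidates if certify(ensemble, x0, u)]
    if not feasible:
        return None
    return min(feasible, key=lambda u: abs(u - u_proposed))


if __name__ == "__main__":
    ens = make_ensemble()
    print(safety_filter(ens, x0=0.0, u_proposed=0.3))  # well inside the safe region
    print(safety_filter(ens, x0=0.7, u_proposed=1.0))  # proposal pushes toward uncertain states
```

In this toy setting the certainty constraint does real work: a proposed action can keep the predicted interval inside the state bounds and still be rejected because it steers into a region where the ensemble members disagree. That mirrors, loosely, the role the summary attributes to the certainty constraint in preventing model exploitation.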

Abstract

Predictive safety filters (PSFs) leverage model predictive control to enforce constraint satisfaction during deep reinforcement learning (RL) exploration, yet their reliance on first-principles models or Gaussian processes limits scalability and broader applicability. Meanwhile, model-based RL (MBRL) methods routinely employ probabilistic ensemble (PE) neural networks to capture complex, high-dimensional dynamics from data with minimal prior knowledge. However, existing attempts to integrate PEs into PSFs lack rigorous uncertainty quantification. We introduce the Uncertainty-Aware Predictive Safety Filter (UPSi), a PSF that provides rigorous safety predictions using PE dynamics models by formulating future outcomes as reachable sets. UPSi introduces an explicit certainty constraint that prevents model exploitation and integrates seamlessly into common MBRL frameworks. We evaluate UPSi within Dyna-style MBRL on standard safe RL benchmarks and report substantial improvements in exploration safety over prior neural network PSFs while maintaining performance on par with standard MBRL. UPSi bridges the gap between the scalability and generality of modern MBRL and the safety guarantees of predictive safety filters.
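
For orientation, the kind of optimization problem the abstract describes can be written in a generic form. The block below is a sketch under assumptions, not the formulation from the paper: the objective is the standard predictive-safety-filter choice of staying close to the RL action, while the reachable-set recursion, the noise sets, the certainty measure, and every symbol are illustrative. Here $u^{\mathrm{RL}}$ is the action proposed by the RL policy, $f_i$ the mean prediction of ensemble member $i$ of $M$, $\mathcal{W}_i$ a bounded set covering that member's predicted noise, $\mathcal{R}_k$ the reachable set at prediction step $k$, $\mathcal{X}$ and $\mathcal{U}$ the state and input constraint sets, $\mathcal{X}_f$ a terminal safe set, and $c$ an uncertainty measure with threshold $\bar{c}$.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \lVert u_0 - u^{\mathrm{RL}} \rVert^2 \\
\text{s.t.} \quad
& \mathcal{R}_0 = \{x(t)\}, \\
& \mathcal{R}_{k+1} \supseteq \bigl\{ f_i(x, u_k) + w \;:\; x \in \mathcal{R}_k,\ w \in \mathcal{W}_i(x, u_k),\ i = 1,\dots,M \bigr\}, \\
& \mathcal{R}_k \subseteq \mathcal{X}, \quad u_k \in \mathcal{U}, \qquad k = 0,\dots,N-1, \\
& \mathcal{R}_N \subseteq \mathcal{X}_f, \\
& c(\mathcal{R}_k, u_k) \le \bar{c}, \qquad k = 0,\dots,N-1 .
\end{aligned}
```

The last line plays the role of a certainty constraint in this sketch: it restricts the planned trajectory to regions where the learned model is sufficiently certain, so the optimizer cannot satisfy the safety constraints by exploiting model error.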