Distill-Belief: Closed-Loop Inverse Source Localization and Characterization in Physical Fields

arXiv cs.AI / April 30, 2026

Key Points

  • Closed-loop inverse source localization and characterization (ISLC) demands that a mobile agent choose informative measurements quickly while estimating latent field parameters under tight time constraints.
  • The key difficulty is that fast learned belief models can produce reward hacking—optimizing the objective by exploiting approximation errors instead of genuinely reducing uncertainty.
  • The paper introduces Distill-Belief, a teacher–student framework where a Bayes-correct particle-filter teacher maintains the posterior and provides a dense information-gain signal, while a compact student distills this into belief statistics plus an uncertainty certificate for stopping (illustrative sketches of both components follow below).
  • During deployment, only the student model is used, giving constant per-step computational cost and avoiding reliance on expensive Bayesian inference at runtime.
  • Experiments across seven field modalities and two stress tests show improved sensing efficiency and success rates, better posterior contraction and estimation accuracy, and reduced reward hacking versus baseline methods.
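
To make the teacher's role concrete, here is a minimal, hypothetical sketch of a particle-filter teacher for a 2-D ISLC problem. The forward model `g`, the noise level `SIGMA`, the prior ranges, and the entropy-based information-gain estimate are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

SIGMA = 0.1  # assumed Gaussian measurement-noise standard deviation


def g(theta, x):
    """Assumed forward model: field strength at location x for a source with
    parameters theta = (source_x, source_y, amplitude); a smooth decay
    chosen purely for illustration."""
    sx, sy, amp = theta[..., 0], theta[..., 1], theta[..., 2]
    r2 = (x[0] - sx) ** 2 + (x[1] - sy) ** 2
    return amp / (1.0 + r2)


class ParticleTeacher:
    """Bayes-correct belief over theta, represented by weighted particles."""

    def __init__(self, n=2000, seed=0):
        self.rng = np.random.default_rng(seed)
        # Assumed prior: position in [0, 10]^2, amplitude in [0.5, 2].
        self.theta = np.column_stack([
            self.rng.uniform(0.0, 10.0, n),
            self.rng.uniform(0.0, 10.0, n),
            self.rng.uniform(0.5, 2.0, n),
        ])
        self.w = np.full(n, 1.0 / n)

    def update(self, x, y):
        """Exact Bayes reweighting after observing measurement y at location x."""
        lik = np.exp(-0.5 * ((y - g(self.theta, x)) / SIGMA) ** 2)
        self.w = self.w * lik + 1e-300  # floor avoids degenerate normalization
        self.w /= self.w.sum()

    def entropy(self):
        """Entropy of the particle weights, used as the uncertainty measure."""
        return -np.sum(self.w * np.log(self.w + 1e-12))

    def expected_info_gain(self, x, n_sim=16):
        """Monte-Carlo estimate of the expected entropy reduction from
        measuring at x; this plays the role of the dense information-gain
        signal the teacher supplies."""
        preds = g(self.theta, x)
        h_post = 0.0
        for _ in range(n_sim):
            i = self.rng.choice(len(self.w), p=self.w)  # sample a plausible theta
            y = preds[i] + self.rng.normal(0.0, SIGMA)  # simulate its measurement
            w = self.w * np.exp(-0.5 * ((y - preds) / SIGMA) ** 2) + 1e-300
            w /= w.sum()
            h_post += -np.sum(w * np.log(w + 1e-12)) / n_sim
        return self.entropy() - h_post
```

A planner could score candidate waypoints with `expected_info_gain` and move to the highest-scoring one; the paper's actual field models and planning scheme may differ.
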

Abstract

Closed-loop inverse source localization and characterization (ISLC) requires a mobile agent to select measurements that localize sources and infer latent field parameters under strict time constraints. The core challenge lies in the belief-space objective: valid uncertainty estimation requires expensive Bayesian inference, whereas using a fast learned belief model leads to reward hacking, in which the policy exploits approximation errors rather than actually reducing uncertainty. We propose Distill-Belief, a teacher–student framework that decouples correctness from efficiency. A Bayes-correct particle-filter teacher maintains the posterior and supplies a dense information-gain signal, while a compact student distills the posterior into belief statistics for control and an uncertainty certificate for stopping. At deployment, only the student is used, yielding constant per-step cost. Experiments on seven field modalities and two stress tests show that Distill-Belief consistently reduces sensing cost and improves success, posterior contraction, and estimation accuracy over baselines, while mitigating reward hacking.
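
As a companion to the teacher sketch above, the following is a minimal, assumed implementation of the student side: a small recurrent network that maps the measurement history to posterior mean/variance estimates (the belief statistics) and a scalar certificate trained to match the teacher's posterior entropy. The architecture, input encoding, and loss weighting are illustrative choices, not the paper's specification.

```python
import torch
import torch.nn as nn


class StudentBelief(nn.Module):
    """Compact student: measurement history -> belief statistics + certificate."""

    def __init__(self, obs_dim=3, hidden=64, theta_dim=3):
        super().__init__()
        # obs_dim assumes each step is encoded as (x1, x2, measurement).
        self.encoder = nn.GRU(obs_dim, hidden, batch_first=True)
        self.mean_head = nn.Linear(hidden, theta_dim)    # posterior mean estimate
        self.logvar_head = nn.Linear(hidden, theta_dim)  # posterior log-variance
        self.cert_head = nn.Linear(hidden, 1)            # uncertainty certificate

    def forward(self, obs_seq):
        _, h = self.encoder(obs_seq)  # h: (1, batch, hidden)
        h = h.squeeze(0)
        return self.mean_head(h), self.logvar_head(h), self.cert_head(h).squeeze(-1)


def distill_step(student, opt, obs_seq, t_mean, t_var, t_entropy):
    """One distillation step: match teacher posterior moments and entropy."""
    mean, logvar, cert = student(obs_seq)
    loss = (nn.functional.mse_loss(mean, t_mean)
            + nn.functional.mse_loss(logvar, t_var.clamp_min(1e-8).log())
            + nn.functional.mse_loss(cert, t_entropy))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

A training step under placeholder data, where the targets would in practice come from `ParticleTeacher` posteriors:

```python
student = StudentBelief()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
obs_seq = torch.randn(8, 5, 3)  # (batch, steps, obs_dim) rollout histories
loss = distill_step(student, opt, obs_seq,
                    torch.randn(8, 3), torch.rand(8, 3) + 0.1, torch.rand(8))
```

At deployment only the student's forward pass runs each step, and the agent stops once the certificate falls below a chosen threshold; this is the constant per-step cost the paper highlights.
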