Situationally-Aware Dynamics Learning

arXiv cs.RO / 4/2/2026


Key Points

  • The paper proposes an online learning framework that enables autonomous robots to infer hidden state representations in real time when unobserved factors affect both robot dynamics and rewards.
  • It formulates the approach as a Generalized Hidden Parameter Markov Decision Process, explicitly modeling how latent parameters influence state transitions and reward structures.
  • The method learns the joint distribution of state transitions online, producing an expressive representation of latent ego and environmental factors that supports identifying distinct operational situations.
  • It uses a multivariate extension of Bayesian Online Changepoint Detection to segment changes in the underlying data-generating process and derives a symbolic “current situation” from recent transition data.
  • Experiments in simulation and on real robots for unstructured terrain navigation show improvements in data efficiency, policy performance, and the development of safer, adaptive navigation strategies.
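
The formulation described in the key points can be written compactly as follows. This is a sketch in assumed notation, not the paper's own: the symbols $\Theta$, $\theta$, $\hat{z}_t$, and $\hat{T}$ are introduced here for illustration. The summary states only that latent parameters influence both transitions and rewards, and that the transition model is informed by a symbolic estimate of the current situation.

```latex
% Hidden-parameter MDP in which an unobserved \theta enters both dynamics and reward
\mathcal{M} = (\mathcal{S}, \mathcal{A}, \Theta, T, R, \gamma), \qquad
s_{t+1} \sim T(\,\cdot \mid s_t, a_t, \theta), \qquad
r_t = R(s_t, a_t, \theta), \qquad \theta \in \Theta \ \text{unobserved}

% Learned model conditioned on a situation label \hat{z}_t inferred from recent transitions
s_{t+1} \sim \hat{T}(\,\cdot \mid s_t, a_t, \hat{z}_t)
```

Because $\theta$ is never observed, the agent can only act on $\hat{z}_t$, which is why detecting when the underlying data-generating process has changed matters.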

Abstract

Autonomous robots operating in complex, unstructured environments face significant challenges due to latent, unobserved factors that obscure their understanding of both their internal state and the external world. Addressing this challenge would enable robots to develop a more profound grasp of their operational context. To tackle this, we propose a novel framework for online learning of hidden state representations, with which robots can adapt in real time to uncertain and dynamic conditions that would otherwise be ambiguous and result in suboptimal or erroneous behaviors. Our approach is formalized as a Generalized Hidden Parameter Markov Decision Process, which explicitly models the influence of unobserved parameters on both transition dynamics and reward structures. Our core innovation lies in learning online the joint distribution of state transitions, which serves as an expressive representation of latent ego and environmental factors. This probabilistic approach supports the identification of, and adaptation to, different operational situations, improving robustness and safety. Through a multivariate extension of Bayesian Online Changepoint Detection, our method segments changes in the underlying data-generating process governing the robot's dynamics. The robot's transition model is then informed with a symbolic representation of the current situation derived from the joint distribution of the latest state transitions, enabling adaptive and context-aware decision-making. To demonstrate real-world effectiveness, we validate our approach on the challenging task of unstructured terrain navigation, where unmodeled and unmeasured terrain characteristics can significantly impact the robot's motion. Extensive experiments in both simulation and the real world reveal significant improvements in data efficiency, policy performance, and the emergence of safer, adaptive navigation strategies.
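
To make the changepoint-detection step concrete, here is a minimal sketch of Bayesian Online Changepoint Detection on vector-valued data. It is not the paper's multivariate extension. It assumes isotropic Gaussian observations with known noise `sigma`, a conjugate Gaussian prior on each segment's mean, and a constant hazard rate. The function name `bocd_gaussian` and all parameter names are hypothetical.

```python
import numpy as np

def bocd_gaussian(X, hazard=0.01, mu0=0.0, tau0=10.0, sigma=1.0):
    """Run-length posterior for BOCD with isotropic Gaussian segments.

    X is a (T, d) array. Each segment is assumed to be N(m, sigma^2 I) with an
    unknown mean m ~ N(mu0, tau0^2 I). Returns R of shape (T+1, T+1), where
    R[t, r] = P(run length = r | first t observations).
    """
    X = np.asarray(X, dtype=float)
    T, d = X.shape
    log_R = np.full((T + 1, T + 1), -np.inf)
    log_R[0, 0] = 0.0  # before any data, the run length is 0 with certainty
    # Posterior over the segment mean, one row per run-length hypothesis.
    mean = np.full((1, d), mu0)
    var = np.array([tau0 ** 2])
    log_h, log_1mh = np.log(hazard), np.log1p(-hazard)

    for t in range(T):
        x = X[t]
        # Predictive log-density of x under each run-length hypothesis.
        pred_var = var + sigma ** 2
        sq = ((x - mean) ** 2).sum(axis=1)
        log_pred = -0.5 * (d * np.log(2 * np.pi * pred_var) + sq / pred_var)
        joint = log_R[t, : t + 1] + log_pred
        # Growth: the current run continues. Changepoint: run length resets to 0.
        log_R[t + 1, 1 : t + 2] = joint + log_1mh
        log_R[t + 1, 0] = np.logaddexp.reduce(joint + log_h)
        # Normalize so each row is a probability distribution.
        log_R[t + 1, : t + 2] -= np.logaddexp.reduce(log_R[t + 1, : t + 2])
        # Conjugate update of the mean posterior, then prepend the prior for r = 0.
        new_var = 1.0 / (1.0 / var + 1.0 / sigma ** 2)
        new_mean = new_var[:, None] * (mean / var[:, None] + x / sigma ** 2)
        mean = np.vstack([np.full((1, d), mu0), new_mean])
        var = np.concatenate([[tau0 ** 2], new_var])

    return np.exp(log_R)

# Usage: synthetic 2-D "transition features" whose mean shifts halfway through.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
R = bocd_gaussian(X)
map_run_length = R.argmax(axis=1)  # most probable run length after each step
```

A sharp drop in `map_run_length` marks a detected change in the data-generating process. In a pipeline like the one the abstract describes, the observations since that drop would be the recent transitions from which a situation label is derived. That linkage is an interpretation of the abstract, not a detail confirmed by it.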