Goal-Oriented Reactive Simulation for Closed-Loop Trajectory Prediction

arXiv cs.RO / March 26, 2026


Key Points

  • The paper argues that trajectory prediction models trained in open-loop settings suffer from covariate shift and compounding errors when used in real closed-loop deployments.
  • It proposes an on-policy, goal-oriented closed-loop training paradigm with a transformer-based scene decoder that makes the training simulation inherently reactive and exposes the ego agent to its own self-induced states.
  • The approach trains the ego predictor to recover from its own execution errors by mixing open-loop data with simulated states generated from the model’s behavior.
  • Experiments report substantial improvements in collision avoidance at high replanning frequencies, with relative collision rate reductions of up to 27.0% on nuScenes and 79.5% in dense DeepScenario intersections versus open-loop baselines.
  • It also finds that a hybrid simulator that combines reactive and non-reactive surrounding agents can balance short-term interactivity with longer-term behavioral stability.
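The hybrid-simulation idea in the last point can be sketched as a simple partition of surrounding agents: some are driven by a reactive policy, the rest replay their logged trajectories. This is an illustrative sketch only; `assign_agent_modes`, `step_scene`, and the 50% split are assumptions, not the paper's API.

```python
import random

def assign_agent_modes(agent_ids, reactive_fraction=0.5, rng=None):
    """Split surrounding agents into a reactive (policy-driven) set and a
    non-reactive (log-replay) set, as in a hybrid simulation (illustrative)."""
    rng = rng or random.Random(0)
    ids = list(agent_ids)
    rng.shuffle(ids)
    k = int(len(ids) * reactive_fraction)
    return {"reactive": set(ids[:k]), "replay": set(ids[k:])}

def step_scene(modes, policy_step, replay_step, states, t):
    """Advance one tick: reactive agents respond to the current scene via
    the policy; replay agents just follow their logged state at time t."""
    return {
        aid: (policy_step(aid, states) if aid in modes["reactive"]
              else replay_step(aid, t))
        for aid in states
    }
```

The trade-off the authors report falls out of `reactive_fraction`: more reactive agents give better short-term interactivity, more replay agents keep long-term behavior anchored to realistic logged trajectories.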

Abstract

Current trajectory prediction models are primarily trained in an open-loop manner, which often leads to covariate shift and compounding errors when deployed in real-world, closed-loop settings. Furthermore, relying on static datasets or non-reactive log-replay simulators severs the interactive loop, preventing the ego agent from learning to actively negotiate surrounding traffic. In this work, we propose an on-policy closed-loop training paradigm optimized for high-frequency, receding horizon ego prediction. To ground the ego prediction in a realistic representation of traffic interactions and to achieve reactive consistency, we introduce a goal-oriented, transformer-based scene decoder, resulting in an inherently reactive training simulation. By exposing the ego agent to a mixture of open-loop data and simulated, self-induced states, the model learns recovery behaviors to correct its own execution errors. Extensive evaluation demonstrates that closed-loop training significantly enhances collision avoidance capabilities at high replanning frequencies, yielding relative collision rate reductions of up to 27.0% on nuScenes and 79.5% in dense DeepScenario intersections compared to open-loop baselines. Additionally, we show that a hybrid simulation combining reactive with non-reactive surrounding agents achieves optimal balance between immediate interactivity and long-term behavioral stability.
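The core training-loop idea (rolling the ego forward on its own predictions, then mixing those self-induced states with logged open-loop data, in the spirit of on-policy/DAgger-style training) can be sketched as follows. All names here (`closed_loop_rollout`, `build_mixed_batch`, `sim_ratio`) are illustrative assumptions, not the paper's implementation.

```python
import random

def closed_loop_rollout(predict, step, state, horizon=5):
    """Roll the ego forward on its own predictions, collecting the
    self-induced (covariate-shifted) states it actually visits."""
    visited = []
    for _ in range(horizon):
        action = predict(state)      # ego executes its own plan
        state = step(state, action)  # reactive simulator advances the scene
        visited.append(state)
    return visited

def build_mixed_batch(logged, predict, step, init_state,
                      sim_ratio=0.5, size=8, rng=None):
    """Assemble a training batch mixing logged open-loop samples with
    states generated by the model's own closed-loop behavior."""
    rng = rng or random.Random(0)
    sim_states = closed_loop_rollout(predict, step, init_state, horizon=size)
    batch = []
    for _ in range(size):
        pool = sim_states if rng.random() < sim_ratio else logged
        batch.append(rng.choice(pool))
    return batch
```

Training on such mixed batches is what lets the predictor learn recovery behaviors: it sees the drifted states its own execution errors produce, rather than only the distribution of logged expert states.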