Receding-Horizon Control via Drifting Models

arXiv cs.AI / 4/7/2026

Key Points

The paper addresses trajectory optimization with unknown system dynamics where learned surrogate models cannot be used to simulate trajectories, and only an offline dataset of trajectories is available.
It proposes “Drifting MPC,” which combines drifting generative models with receding-horizon planning to learn a conditional trajectory distribution supported by the data but biased toward low-cost (optimal) plans.
The authors characterize the learned distribution as the unique optimizer of an objective that explicitly trades off cost optimality against closeness to the offline prior distribution.
Empirical results indicate that Drifting MPC can produce near-optimal trajectories while keeping the one-step inference efficiency typical of drifting models and generating trajectories faster than diffusion-based baselines.
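To make the trade-off between cost and closeness to the offline prior concrete, here is a minimal sketch under a loud assumption: the summary does not state the form of the objective, so this uses a KL-regularized objective over an empirical (uniform) prior, whose known minimizer is the exponentially tilted prior. The names `tilted_weights`, `objective`, and `temperature`, and the toy costs, are illustrative and not taken from the paper.

```python
import numpy as np

def tilted_weights(costs, temperature):
    """Weights proportional to exp(-cost / temperature) over dataset samples.

    Assumption: a KL-regularized trade-off against a uniform empirical prior;
    this is an illustrative choice, not the objective stated in the paper.
    """
    costs = np.asarray(costs, dtype=float)
    logits = -(costs - costs.min()) / temperature  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def objective(weights, costs, temperature):
    """Expected cost plus temperature * KL(weights || uniform prior)."""
    prior = np.full(len(costs), 1.0 / len(costs))
    mask = weights > 0  # treat 0 * log(0) as 0
    kl = np.sum(weights[mask] * np.log(weights[mask] / prior[mask]))
    return float(np.dot(weights, costs) + temperature * kl)

costs = np.array([1.0, 2.0, 4.0])                 # hypothetical trajectory costs
w_cold = tilted_weights(costs, temperature=0.1)   # concentrates on the cheapest sample
w_hot = tilted_weights(costs, temperature=100.0)  # stays close to the uniform prior
w_mid = tilted_weights(costs, temperature=1.0)    # in between
```

In this sketch a small `temperature` favors optimality and a large one favors staying near the data; the tilted weights never place mass on trajectories outside the dataset.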

Abstract

We study the problem of trajectory optimization in settings where the system dynamics are unknown and it is not possible to simulate trajectories through a surrogate model. When an offline dataset of trajectories is available, an agent could directly learn a trajectory generator by distribution matching. However, this approach only recovers the behavior distribution in the dataset, and does not in general produce a model that minimizes a desired cost criterion. In this work, we propose Drifting MPC, an offline trajectory optimization framework that combines drifting generative models with receding-horizon planning under unknown dynamics. The goal of Drifting MPC is to learn, from an offline dataset of trajectories, a conditional distribution over trajectories that is both supported by the data and biased toward optimal plans. We show that the resulting distribution learned by Drifting MPC is the unique solution of an objective that trades off optimality with closeness to the offline prior. Empirically, we show that Drifting MPC can generate near-optimal trajectories while retaining the one-step inference efficiency of drifting models and substantially reducing generation time relative to diffusion-based baselines.
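The receding-horizon pattern described in the abstract can be sketched as follows. This is only an illustration of the control loop, not the authors' method: `sample_plan` is a hypothetical stand-in for a trained one-step trajectory generator, and `env_step` is a toy 1-D system that plays the role of the real environment (the planner never queries a model of it).

```python
import numpy as np

def sample_plan(state, horizon, rng):
    """Hypothetical stand-in for a trained one-step trajectory generator.

    Returns `horizon` actions conditioned on `state`. Here it is a noisy
    proportional rule toward the origin, chosen purely for illustration.
    """
    return -0.5 * state * np.ones(horizon) + 0.01 * rng.standard_normal(horizon)

def env_step(state, action):
    """Toy 1-D environment standing in for the real, unknown system."""
    return state + action

def receding_horizon_rollout(state, steps, horizon, seed=0):
    """Replan at every step and execute only the first planned action."""
    rng = np.random.default_rng(seed)
    states = [state]
    for _ in range(steps):
        plan = sample_plan(state, horizon, rng)  # one generator call per replan
        state = env_step(state, plan[0])         # apply the first action only
        states.append(state)
    return np.array(states)

states = receding_horizon_rollout(state=1.0, steps=20, horizon=5)
```

Because the plan is regenerated from the current state at each step, the cost of a single generator call directly sets the control rate, which is why one-step inference matters in this setting.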