MISTY: High-Throughput Motion Planning via Mixer-based Single-step Drifting

arXiv cs.RO / 4/24/2026


Key Points

  • MISTY is a new single-step, high-throughput generative motion planner for autonomous driving that avoids the iterative neural evaluations that cause diffusion-based planners’ high latency.
  • The approach combines a vectorized Sub-Graph encoder for environment context, a VAE that compresses expert trajectories into a 32-dimensional latent space, and an MLP-Mixer decoder to remove the quadratic complexity of attention.
  • MISTY introduces a latent-space drifting loss that shifts the complex distribution evolution entirely into the training phase, so inference needs only a single forward pass.
  • By modeling explicit attractive and repulsive “forces” in latent space, the method can generate proactive maneuvers like active overtaking that are rare in the original expert demonstrations.
  • On the nuPlan benchmark (Test14-hard split), MISTY reports state-of-the-art closed-loop scores of 80.32 (non-reactive) and 82.21 (reactive), running at over 99 FPS with 10.1 ms end-to-end latency, roughly an order of magnitude faster than iterative diffusion planners.

Abstract

Multi-modal trajectory generation is essential for safe autonomous driving, yet existing diffusion-based planners suffer from high inference latency due to iterative neural function evaluations. This paper presents MISTY (Mixer-based Inference for Single-step Trajectory-drifting Yield), a high-throughput generative motion planner that achieves state-of-the-art closed-loop performance with pure single-step inference. MISTY integrates a vectorized Sub-Graph encoder to capture environment context, a Variational Autoencoder to structure expert trajectories into a compact 32-dimensional latent manifold, and an ultra-lightweight MLP-Mixer decoder to eliminate quadratic attention complexity. Importantly, we introduce a latent-space drifting loss that shifts the complex distribution evolution entirely to the training phase. By formulating explicit attractive and repulsive forces, this mechanism empowers the model to synthesize novel, proactive maneuvers, such as active overtaking, that are virtually absent from the raw expert demonstrations. Extensive evaluations on the nuPlan benchmark demonstrate that MISTY achieves state-of-the-art results on the challenging Test14-hard split, with comprehensive scores of 80.32 and 82.21 in non-reactive and reactive settings, respectively. Operating at over 99 FPS with an end-to-end latency of 10.1 ms, MISTY offers an order-of-magnitude speedup over iterative diffusion planners while achieving robust generation.
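
The abstract does not give the form of the drifting loss, so the following is only a rough sketch of how attractive and repulsive forces in a latent space could be combined. It assumes a kernel-weighted, mean-shift style field: each generated latent is pulled toward nearby expert latents and pushed away from the other generated latents. The function names, the softmax kernel, the temperature, and the sample counts are all hypothetical; only the 32-dimensional latent size comes from the abstract.

```python
import numpy as np

def kernel_weights(query, points, temperature, mask_self=False):
    # softmax over negative squared distances, one row per query point
    d2 = ((query[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    if mask_self:
        # only valid when query and points are the same set
        np.fill_diagonal(logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def drift_field(generated, expert, temperature=1.0):
    # attraction: vector toward the kernel-weighted mean of expert latents
    w_pos = kernel_weights(generated, expert, temperature)
    attract = w_pos @ expert - generated
    # repulsion: vector toward the other generated latents, later subtracted
    w_neg = kernel_weights(generated, generated, temperature, mask_self=True)
    toward_generated = w_neg @ generated - generated
    return attract - toward_generated

rng = np.random.default_rng(0)
latent_dim = 32                                      # from the abstract
expert = rng.normal(3.0, 0.1, size=(64, latent_dim))  # hypothetical expert latents
generated = rng.normal(0.0, 0.1, size=(8, latent_dim))  # hypothetical generator outputs

drift = drift_field(generated, expert)
# Regression target for a single-step generator; in an autodiff framework
# this target would be wrapped in a stop-gradient (assumption).
target = generated + drift
loss = np.mean(np.sum((generated - target) ** 2, axis=-1))
print(drift.shape, float(loss))
```

In a scheme like this, the iterative movement of samples toward the data distribution happens across training steps as the generator chases its drifted targets, which is one way to read the claim that distribution evolution is shifted to the training phase while inference stays a single forward pass.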