DynVLA: Learning World Dynamics for Action Reasoning in Autonomous Driving
arXiv cs.CV / 3/12/2026
Key Points
- DynVLA introduces Dynamics CoT, a novel reasoning paradigm for autonomous driving that forecasts compact world dynamics before generating actions.
- A Dynamics Tokenizer compresses future evolution into a small set of dynamics tokens to enable physically grounded and latency-efficient decision-making.
- The model decouples ego-centric and environment-centric dynamics to better capture interaction-rich driving scenarios. It is reported to outperform Textual CoT and Visual CoT baselines on NAVSIM, Bench2Drive, and in-house datasets.
- By providing a compact, interpretable representation of world dynamics, DynVLA reduces redundancy compared to dense image predictions while maintaining practical inference latency.
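
The "tokens first, action second" ordering described in the key points can be illustrated with a toy sketch. This is not the paper's implementation: the summary does not say how the Dynamics Tokenizer is built, so a plain nearest-neighbor codebook lookup is assumed here. The codebooks, the 2-D change vectors, and the names `quantize`, `tokenize_dynamics`, and `plan` are all hypothetical, and the inputs stand in for dynamics that a real model would forecast itself.

```python
from math import dist

# Hypothetical toy codebooks (assumption, not from the paper).
# Each entry is a 2-D prototype of a future change.
EGO_CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]    # hold, accelerate, steer
ENV_CODEBOOK = [(0.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]  # static, lead braking, cut-in


def quantize(vector, codebook):
    """Return the index of the nearest codebook entry (one discrete token)."""
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))


def tokenize_dynamics(ego_change, env_change):
    """Compress a future change into two tokens, one per decoupled stream."""
    return quantize(ego_change, EGO_CODEBOOK), quantize(env_change, ENV_CODEBOOK)


def plan(ego_change, env_change):
    """Emit dynamics tokens first, then pick an action conditioned on them."""
    ego_tok, env_tok = tokenize_dynamics(ego_change, env_change)
    # Hypothetical hand-written rule standing in for a learned action head.
    action = "brake" if env_tok == 1 else "proceed"
    return {"dynamics_tokens": (ego_tok, env_tok), "action": action}


result = plan(ego_change=(0.9, 0.1), env_change=(-0.8, 0.1))
print(result)
```

The point of the sketch is only the shape of the pipeline: two small token streams, kept separate for ego and environment, are produced before any action is chosen, in place of a dense image prediction.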




