
Senna-2: Aligning VLM and End-to-End Driving Policy for Consistent Decision Making and Planning

arXiv cs.CV / 3/13/2026


Key Points

  • Senna-2 is a new VLM-E2E driving policy that explicitly aligns high-level VLM decisions with low-level E2E planning to ensure consistent decision-making and trajectory planning.
  • It introduces a three-stage training paradigm: driving pre-training with a decision adapter, open-loop VLM-E2E alignment, and closed-loop bottom-up hierarchical reinforcement learning in 3DGS environments to reinforce safety and efficiency.
  • The approach reports significant gains, including a 19.3% F1 score improvement for dual-system consistency, a 5.7% FDE reduction in open-loop settings, and a 30.6% AF-CR reduction in closed-loop settings.
  • The results indicate improved top-down guidance and decision-following, leading to more reliable trajectories and improved driving safety and efficiency.
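The dual-system consistency figure above is reported as an F1 score. The summary does not say how that score is computed, so the following is only a minimal sketch under stated assumptions: each frame has a discrete VLM decision and a decision label inferred from the planned trajectory, and consistency is measured as the macro-averaged F1 between the two. The function name `macro_f1` and the label strings are hypothetical and do not come from the paper.

```python
def macro_f1(vlm_decisions, planner_decisions):
    """Macro-averaged F1 between two equal-length label sequences.

    Assumption: the VLM decision is treated as the reference and the
    decision implied by the planned trajectory as the prediction.
    """
    if len(vlm_decisions) != len(planner_decisions):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(vlm_decisions, planner_decisions))
    labels = sorted(set(vlm_decisions) | set(planner_decisions))

    scores = []
    for label in labels:
        tp = sum(1 for v, p in pairs if v == label and p == label)
        fp = sum(1 for v, p in pairs if v != label and p == label)
        fn = sum(1 for v, p in pairs if v == label and p != label)
        denom = 2 * tp + fp + fn
        # A label with no true positives, false positives, or false
        # negatives cannot occur here, but guard against division by zero.
        scores.append(2 * tp / denom if denom else 0.0)

    return sum(scores) / len(scores)


# Hypothetical per-frame labels, for illustration only.
vlm = ["keep", "left", "keep", "stop"]
planner = ["keep", "left", "stop", "stop"]
score = macro_f1(vlm, planner)
```

Whether the reported 19.3% is an absolute or a relative change in F1 is not stated in this summary, so the sketch only shows how a score of this kind could be obtained, not how the improvement was derived.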

Abstract

Vision-language models (VLMs) enhance the planning capability of end-to-end (E2E) driving policies by leveraging high-level semantic reasoning. However, existing approaches often overlook the dual-system consistency between the VLM's high-level decisions and the E2E policy's low-level planning. As a result, the generated trajectories may not match the intended driving decisions, which weakens the system's top-down guidance and its ability to follow decisions. To address this issue, we propose Senna-2, an advanced VLM-E2E driving policy that explicitly aligns the two systems for consistent decision-making and planning. Our method follows a consistency-oriented three-stage training paradigm. In the first stage, we conduct driving pre-training to achieve preliminary decision-making and planning, with a decision adapter transmitting VLM decisions to the E2E policy in the form of implicit embeddings. In the second stage, we align the VLM and the E2E policy in an open-loop setting. In the third stage, we perform closed-loop alignment via bottom-up Hierarchical Reinforcement Learning in 3DGS environments to reinforce safety and efficiency. Extensive experiments demonstrate that Senna-2 achieves superior dual-system consistency (19.3% F1 score improvement) and significantly enhances driving safety in both open-loop (5.7% FDE reduction) and closed-loop settings (30.6% AF-CR reduction).
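For readers unfamiliar with the open-loop metric, FDE (final displacement error) is commonly defined as the Euclidean distance between the last predicted waypoint and the last ground-truth waypoint. The sketch below follows that common definition and assumes 2D waypoints in meters. The helper `relative_reduction` assumes the quoted percentage is relative to a baseline, which the abstract does not confirm. Both function names are illustrative.

```python
import math


def final_displacement_error(predicted, ground_truth):
    """Euclidean distance between the final waypoints of two trajectories.

    Each trajectory is a non-empty sequence of (x, y) positions.
    """
    if not predicted or not ground_truth:
        raise ValueError("trajectories must be non-empty")
    px, py = predicted[-1]
    gx, gy = ground_truth[-1]
    return math.hypot(px - gx, py - gy)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`.

    Assumption: reductions are expressed relative to the baseline value.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - improved) / baseline * 100.0


# Illustrative trajectories, not data from the paper.
planned = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
logged = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
fde = final_displacement_error(planned, logged)
```

In practice the metric is averaged over many scenes, and the planning horizon at which the final waypoint is taken determines how comparable two reported FDE values are.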