AI Navigate

CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention

arXiv cs.CV / March 20, 2026


Key Points

  • The paper identifies that planning-oriented end-to-end driving models often learn correlations instead of causal relationships, making them vulnerable to dataset biases.
  • It introduces CausalVAD, a de-confounding training framework that uses causal intervention to remove spurious associations from representations.
  • Central to CausalVAD is the sparse causal intervention scheme (SCIS), a plug-and-play module that implements backdoor adjustment by building a dictionary of latent driving-context prototypes and intervening on the model's sparse queries.
  • Experiments on nuScenes show CausalVAD achieves state-of-the-art planning accuracy and safety, with improved robustness against data bias and noisy scenarios designed to induce causal confusion.
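The backdoor adjustment that SCIS instantiates is the standard formula from causal inference. Writing X for the model's input representation, Y for the planning output, and S for the latent driving context acting as a confounder (the symbol names here are ours, not the paper's), intervening on X amounts to marginalizing over the confounder rather than conditioning on it:

$$
P(Y \mid \mathrm{do}(X)) = \sum_{s} P(Y \mid X, S = s)\, P(s)
$$

Because S is latent and high-dimensional, summing over it exactly is intractable; the prototype dictionary serves as a finite set of representatives of S over which this expectation can be approximated.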

Abstract

Planning-oriented end-to-end driving models show great promise, yet they fundamentally learn statistical correlations rather than true causal relationships. This vulnerability leads to causal confusion, where models exploit dataset biases as shortcuts, critically harming their reliability and safety in complex scenarios. To address this, we introduce CausalVAD, a de-confounding training framework that leverages causal intervention. At its core, we design the sparse causal intervention scheme (SCIS), a lightweight, plug-and-play module that instantiates backdoor adjustment in neural networks. SCIS constructs a dictionary of prototypes representing latent driving contexts, then uses this dictionary to intervene on the model's sparse vectorized queries. This step actively eliminates spurious associations induced by confounders, yielding de-confounded representations for downstream tasks. Extensive experiments on the nuScenes benchmark show that CausalVAD achieves state-of-the-art planning accuracy and safety. Furthermore, our method demonstrates superior robustness against both data bias and noisy scenarios configured to induce causal confusion.
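To make the abstract's mechanism concrete, here is a minimal sketch of a plug-and-play intervention module in PyTorch. It is our illustration, not the paper's released code: the class name, shapes, fusion layer, and the choice to approximate the expectation over confounders with attention weights over the prototype dictionary (a common neural approximation of backdoor adjustment) are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseCausalIntervention(nn.Module):
    """Illustrative backdoor-adjustment module (hypothetical SCIS-style
    design, not the paper's actual implementation).

    Holds a learnable dictionary of K latent driving-context prototypes
    and mixes each sparse query with its expectation over that
    dictionary, approximating P(Y | do(X)) = sum_s P(Y | X, s) P(s).
    """

    def __init__(self, dim: int, num_prototypes: int = 16):
        super().__init__()
        # Dictionary of latent driving-context prototypes (confounder S).
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))
        # Fuses the original query with the intervened context.
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, queries: torch.Tensor) -> torch.Tensor:
        # queries: (B, N, dim) sparse vectorized queries.
        # Attention of each query over the prototype dictionary: (B, N, K).
        attn = F.softmax(queries @ self.prototypes.T, dim=-1)
        # Approximate expectation over context prototypes: (B, N, dim).
        context = attn @ self.prototypes
        # Concatenate query and context, project back to dim.
        return self.proj(torch.cat([queries, context], dim=-1))

# Plug-and-play usage: wrap existing queries before the planning head.
scis = SparseCausalIntervention(dim=256)
q = torch.randn(2, 50, 256)   # batch of 2, 50 sparse queries each
out = scis(q)
print(out.shape)              # torch.Size([2, 50, 256])
```

Because the module preserves the query shape, it can in principle be inserted between a query-based perception backbone and its downstream planning head without changing either side, which is what "plug-and-play" suggests here.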