Towards Viewpoint-Robust End-to-End Autonomous Driving with 3D Foundation Model Priors

arXiv cs.CV / 4/2/2026

Key Points

  • The paper addresses a key limitation in end-to-end autonomous driving: many existing trajectory-planning models degrade when the camera viewpoint deviates from the training distribution.
  • It proposes an augmentation-free technique that leverages geometric priors from a 3D foundation model, injecting per-pixel 3D positions (derived from depth estimates) as positional embeddings and fusing intermediate geometric features via cross-attention (see the sketch after this list).
  • Experiments on the VR-Drive benchmark (camera viewpoint perturbations) show reduced performance drop across most perturbation types.
  • The approach yields the clearest improvements for pitch and height perturbations, while gains under longitudinal translation are smaller, indicating a need for more viewpoint-agnostic integration of the geometric priors.
  • Overall, the work suggests that incorporating 3D geometric priors into end-to-end pipelines can improve viewpoint robustness without relying on additional data augmentation.
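To make the positional-embedding idea concrete, here is a minimal sketch (PyTorch assumed) of lifting a per-pixel depth map to 3D camera-frame positions and encoding them as embeddings that can be added to the backbone's image features. The depth source, the MLP design, and all tensor shapes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


def unproject_depth(depth: torch.Tensor, K: torch.Tensor) -> torch.Tensor:
    """Lift a depth map (B, H, W) to per-pixel 3D points (B, H, W, 3)
    in camera coordinates using intrinsics K (B, 3, 3)."""
    B, H, W = depth.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype, device=depth.device),
        torch.arange(W, dtype=depth.dtype, device=depth.device),
        indexing="ij",
    )
    ones = torch.ones_like(xs)
    pix = torch.stack([xs, ys, ones], dim=-1)         # (H, W, 3) homogeneous pixel coords
    K_inv = torch.inverse(K)                          # (B, 3, 3)
    rays = torch.einsum("bij,hwj->bhwi", K_inv, pix)  # (B, H, W, 3) camera rays
    return rays * depth.unsqueeze(-1)                 # scale each ray by its depth


class Point3DEmbedding(nn.Module):
    """Map per-pixel 3D positions to the feature channel width so they can be
    injected into the image features as positional embeddings."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        return self.mlp(points)                       # (B, H, W, dim)


if __name__ == "__main__":
    depth = torch.rand(2, 32, 64) * 50.0              # stand-in for a depth estimate
    K = torch.eye(3).expand(2, 3, 3)
    pts = unproject_depth(depth, K)
    emb = Point3DEmbedding(dim=256)(pts)
    print(pts.shape, emb.shape)                       # (2, 32, 64, 3) (2, 32, 64, 256)
```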

Abstract

Robust trajectory planning under camera viewpoint changes is important for scalable end-to-end autonomous driving. However, existing models often depend heavily on the camera viewpoints seen during training. We investigate an augmentation-free approach that leverages geometric priors from a 3D foundation model. The method injects per-pixel 3D positions derived from depth estimates as positional embeddings and fuses intermediate geometric features through cross-attention. Experiments on the VR-Drive camera viewpoint perturbation benchmark show reduced performance degradation under most perturbation conditions, with clear improvements under pitch and height perturbations. Gains under longitudinal translation are smaller, suggesting that a more viewpoint-agnostic integration of the geometric priors is still needed.
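
The second component, fusing the 3D foundation model's intermediate features through cross-attention, can be sketched as follows (PyTorch assumed). The single pre-norm block, token shapes, and names such as GeometricCrossAttentionFusion are illustrative assumptions; the paper's actual fusion layers may be arranged differently.

```python
import torch
import torch.nn as nn


class GeometricCrossAttentionFusion(nn.Module):
    """Driving-model tokens act as queries and attend to geometric-feature
    tokens (keys/values) coming from the 3D foundation model."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, driving_tokens: torch.Tensor, geo_tokens: torch.Tensor) -> torch.Tensor:
        q = self.norm_q(driving_tokens)
        kv = self.norm_kv(geo_tokens)
        fused, _ = self.attn(q, kv, kv)               # cross-attention: queries from the driving branch
        x = driving_tokens + fused                    # residual connection
        return x + self.ffn(x)                        # feed-forward refinement


if __name__ == "__main__":
    driving = torch.randn(2, 100, 256)                # e.g. flattened image/BEV tokens
    geo = torch.randn(2, 2048, 256)                   # e.g. flattened geometric feature map
    out = GeometricCrossAttentionFusion()(driving, geo)
    print(out.shape)                                  # (2, 100, 256)
```

The residual structure keeps the original driving features intact, so the geometric branch acts as an additive prior rather than replacing the learned representation.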