Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis
arXiv cs.CV / 5/1/2026
Key Points
- The study investigates how robust vision-language models (VLMs) used in autonomous driving are to physical adversarial attacks, focusing on whether attacks can transfer across different model architectures.
- Researchers conduct a cross-architecture evaluation using three representative VLM-based driving architectures (Dolphins, OmniDrive, and LeapVAD) with physically realizable patch attacks on roadside infrastructure in crosswalk and highway scenarios.
- The results show high cross-architecture transferability, with reported transfer rates of 73–91% and mean transfer rates (TR) of 0.815 in the crosswalk scenario and 0.833 in the highway scenario.
- Frame-level manipulation persists for a large portion of the critical decision window (64.7–79.4%) even when adversarial patches are not optimized for the target model, suggesting a practical security risk.
- Overall, the findings indicate that attackers may not need knowledge of the specific deployed VLM architecture to induce harmful perception or decision disruptions in driving contexts.
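To make the transfer-rate figures above concrete, here is a minimal sketch of how a cross-architecture transfer rate (TR) might be computed: the fraction of attacks that succeed on a source model and also succeed on a target model. The function name, data, and exact definition are illustrative assumptions, not taken from the paper.

```python
def transfer_rate(source_success: list[bool], target_success: list[bool]) -> float:
    """Fraction of source-successful attacks that also fool the target model.

    source_success[i] -- did attack i fool the model it was crafted on?
    target_success[i] -- did the same patch fool the (unseen) target model?
    """
    transferred = [t for s, t in zip(source_success, target_success) if s]
    return sum(transferred) / len(transferred) if transferred else 0.0

# Illustrative per-attack outcomes for one source/target architecture pair.
src = [True, True, True, True, False]
tgt = [True, True, False, True, False]

print(f"TR = {transfer_rate(src, tgt):.3f}")  # 3 of 4 transferred -> TR = 0.750
```

A mean TR such as the reported 0.815 would then be an average of such per-pair rates over all source/target architecture combinations.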