Tex3D: Objects as Attack Surfaces via Adversarial 3D Textures for Vision-Language-Action Models
arXiv cs.CV / 4/3/2026
Key Points
- The paper introduces Tex3D, a framework for end-to-end optimization of physically realizable adversarial 3D textures that attack vision-language-action (VLA) robotic manipulation models through object appearance.
- It identifies a key technical challenge: standard 3D simulators are not differentiable, so gradients of the VLA objective cannot be propagated back to object textures, which prevents straightforward end-to-end optimization.
- Tex3D addresses this with Foreground-Background Decoupling (FBD), using dual-renderer alignment to enable differentiable texture optimization while keeping the original simulation environment.
- To keep the attack effective in real-world settings with long task horizons and viewpoint changes, it proposes Trajectory-Aware Adversarial Optimization (TAAO), which concentrates optimization on behaviorally critical frames and stabilizes it with a vertex-based texture parameterization.
- Experiments in simulation and on a real robot show substantial degradation of VLA performance, with reported task failure rates of up to 96.7%, indicating that VLA models are significantly vulnerable to physically grounded attacks.
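
The ideas in the points above can be illustrated with a toy sketch. This is not the paper's implementation. It assumes a linear surrogate in place of the VLA objective, a fixed pixel-to-vertex lookup in place of a differentiable renderer, and a softmax over made-up "criticality" scores in place of TAAO's frame selection. All function names, parameters, and data below are hypothetical. What it shows is the general pattern: the background is copied from the original simulator frame, only foreground pixels depend on the per-vertex colors, and frames are weighted before the gradient is accumulated.

```python
import math


def composite(vertex_colors, mask, background, pix_to_vert):
    """Blend the textured object over a fixed background, pixel by pixel.

    Foreground pixels (mask == 1) take the color of their assigned vertex;
    background pixels (mask == 0) are copied through unchanged.
    """
    return [
        m * vertex_colors[k] + (1 - m) * b
        for m, b, k in zip(mask, background, pix_to_vert)
    ]


def frame_weights(scores, temperature=1.0):
    """Softmax over per-frame criticality scores (hypothetical weighting)."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss(vertex_colors, frames, alphas, pixel_weights):
    """Frame-weighted linear surrogate standing in for the attack objective."""
    loss = 0.0
    for frame, alpha in zip(frames, alphas):
        image = composite(vertex_colors, frame["mask"],
                          frame["background"], frame["pix_to_vert"])
        loss += alpha * sum(w * x for w, x in zip(pixel_weights, image))
    return loss


def texture_gradient(n_vertices, frames, alphas, pixel_weights):
    """Analytic gradient of weighted_loss with respect to the vertex colors."""
    grad = [0.0] * n_vertices
    for frame, alpha in zip(frames, alphas):
        for m, k, w in zip(frame["mask"], frame["pix_to_vert"], pixel_weights):
            grad[k] += alpha * m * w  # background pixels (m == 0) add nothing
    return grad


def attack(vertex_colors, frames, alphas, pixel_weights, step=0.5, iters=50):
    """Projected gradient ascent on vertex colors, clipped to [0, 1]."""
    colors = list(vertex_colors)
    for _ in range(iters):
        grad = texture_gradient(len(colors), frames, alphas, pixel_weights)
        colors = [min(1.0, max(0.0, c + step * g))
                  for c, g in zip(colors, grad)]
    return colors


# Toy data: two 4-pixel frames of an object with two vertices.
frames = [
    {"mask": [1, 1, 0, 0], "background": [0.2, 0.2, 0.5, 0.5],
     "pix_to_vert": [0, 1, 0, 0]},
    {"mask": [0, 1, 1, 0], "background": [0.3, 0.3, 0.3, 0.3],
     "pix_to_vert": [0, 1, 1, 0]},
]
pixel_weights = [1.0, -1.0, 0.5, 0.0]  # surrogate objective, not a real VLA
scores = [0.0, 2.0]                    # second frame treated as more critical
init_colors = [0.5, 0.5]

alphas = frame_weights(scores)
adv_colors = attack(init_colors, frames, alphas, pixel_weights)
print("frame weights:", alphas)
print("loss before:", weighted_loss(init_colors, frames, alphas, pixel_weights))
print("loss after:", weighted_loss(adv_colors, frames, alphas, pixel_weights))
```

Because the background term never depends on `vertex_colors`, the gradient is nonzero only at foreground pixels. That is the property that would let a separate differentiable foreground renderer be optimized while the original simulator still supplies the rest of the scene. Parameterizing the texture per vertex also keeps the number of free variables small relative to a full-resolution texture map, which is one plausible reason such a parameterization could stabilize optimization.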




