Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

arXiv cs.CV / 5/6/2026


Key Points

  • The paper surveys safety risks in Embodied AI, where agents perceive, plan, act, and interact in open-world, safety-critical settings like transportation, healthcare, and robotics.
  • It highlights why embodied systems are uniquely dangerous compared with purely digital AI, due to uncertain sensing, incomplete knowledge, and dynamic human-robot interactions that can cause direct physical harm.
  • The authors propose a multi-level taxonomy covering attacks and defenses across the entire embodied pipeline, from perception and cognition to planning, action, and interaction.
  • Drawing on 400+ papers, the review synthesizes work on adversarial, backdoor, jailbreak, and hardware-level attacks, along with methods for detection, safe training, and robust inference.
  • It identifies key underexplored challenges, including fragility in multimodal perception fusion, planning instability under jailbreak attacks, and ensuring trustworthy human-agent interaction in open-ended scenarios.

Abstract

Embodied Artificial Intelligence (Embodied AI) integrates perception, cognition, planning, and interaction into agents that operate in open-world, safety-critical environments. As these systems gain autonomy and enter domains such as transportation, healthcare, and industrial or assistive robotics, ensuring their safety becomes both technically challenging and socially indispensable. Unlike digital AI systems, embodied agents must act under uncertain sensing, incomplete knowledge, and dynamic human-robot interactions, where failures can directly lead to physical harm. This survey provides a comprehensive and structured review of safety research in embodied AI, examining attacks and defenses across the full embodied pipeline, from perception and cognition to planning, action, interaction, and agentic systems. We introduce a multi-level taxonomy that unifies fragmented lines of work and connects embodied-specific safety findings with broader advances in vision, language, and multimodal foundation models. Our review synthesizes insights from over 400 papers spanning adversarial, backdoor, jailbreak, and hardware-level attacks; attack detection, safe training, and robust inference; and risk-aware human-agent interaction. This analysis reveals several overlooked challenges, including the fragility of multimodal perception fusion, the instability of planning under jailbreak attacks, and the trustworthiness of human-agent interaction in open-ended scenarios. By organizing the field into a coherent framework and identifying critical research gaps, this survey provides a roadmap for building embodied agents that are not only capable and autonomous but also safe, robust, and reliable in real-world deployment.
