Learning from Demonstration with Failure Awareness for Safe Robot Navigation

arXiv cs.RO / 4/28/2026


Key Points

  • The paper tackles a safety gap in learning-from-demonstration robot navigation, where training data mainly cover successful behaviors and provide little information about unsafe states.
  • It argues that failure experiences (e.g., collisions) are informative about hazardous regions, but naïvely adding them to imitation/policy learning can worsen performance.
  • The authors propose a failure-aware learning framework that separates how success and failure data are used: failure data inform value estimation in dangerous regions, while policy learning uses only successful demonstrations.
  • Experiments in an offline reinforcement learning setting, both in simulation and on real robots, show reduced collision rates without sacrificing task success, along with improved generalization across environments and robot platforms.

Abstract

Learning from demonstration is widely used for robot navigation, yet it suffers from a fundamental limitation: demonstrations consist predominantly of successful behaviors and provide limited coverage of unsafe states. This limitation leads to poor safety when the robot encounters scenarios beyond the demonstration distribution. Failure experiences, such as collisions, contain essential information about unsafe regions, but remain underutilized. The key difficulty lies in the fact that failure data do not provide valid guidance for action imitation, and their naive incorporation into policy learning often degrades performance. We address this challenge by proposing a failure-aware learning framework that explicitly decouples the roles of success and failure data. In this framework, failure experiences are used to shape value estimation in hazardous regions, while policy learning is restricted to successful demonstrations. This separation enables the effective use of failure data without corrupting policy behavior. We implement this design within an offline reinforcement learning (RL) setting and evaluate it in both simulation and real-world environments. The results show that our framework consistently reduces collision rates while preserving the task success rate, and demonstrate strong generalization across different environments and robot platforms.
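The decoupling described above can be pictured with a toy tabular sketch: value estimation consumes both success and failure transitions (failures treated as terminal states with a penalty), while policy extraction only counts actions from successful demonstrations. This is an illustrative assumption about the idea, not the paper's actual implementation; all function names, the grid sizes, and the penalty value are made up.

```python
import numpy as np

def q_update(Q, transitions, gamma=0.99, alpha=0.5, collision_penalty=-1.0):
    """Value estimation uses BOTH success and failure data.
    Failure transitions are treated as terminal with a fixed penalty,
    so hazardous regions acquire low value estimates."""
    for s, a, r, s_next, failed in transitions:
        target = collision_penalty if failed else r + gamma * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])
    return Q

def policy_counts_update(counts, transitions):
    """Policy extraction (behavior-cloning-style) uses ONLY successful
    transitions, so failure actions never contribute imitation targets."""
    for s, a, r, s_next, failed in transitions:
        if not failed:
            counts[s, a] += 1
    return counts

# Two transitions: one safe step, one collision (marked failed=True).
transitions = [(0, 0, 0.0, 1, False), (1, 1, 0.0, 2, True)]
Q = q_update(np.zeros((3, 2)), transitions)
counts = policy_counts_update(np.zeros((3, 2)), transitions)
```

After these updates, the collision state-action pair carries a negative value estimate, yet the policy counts contain no trace of the failed action, which is the separation the framework relies on.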