Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation

arXiv cs.RO / 5/5/2026



Key Points

The article argues that in many deep reinforcement learning robot navigation setups, a single collision during training triggers an immediate global environment reset, which wastes potentially useful experience.
It proposes a Multi-Collision reset Budget (MCB) framework that separates local collision handling from full episode resets, letting an agent retry difficult obstacle configurations within the same episode.
Experiments across simulated and real-world robotic platforms show MCB accelerates early exploration and improves navigation success rate and efficiency compared with single-collision reset baselines.
The results indicate that the largest gains come from a small collision budget, that is, from allowing only a few retries per episode.

Abstract

Should a single collision necessarily terminate an entire navigation episode? In most deep reinforcement learning (DRL) frameworks for robot navigation, this remains the standard practice: every collision immediately triggers a global environment reset and is penalized as a complete task failure. While a collision during deployment naturally indicates task failure, applying the same treatment during training prevents the agent from exploring challenging obstacle configurations, which slows learning progress in the early training phase. In this work, we challenge this convention and propose a Multi-Collision reset Budget (MCB) framework that decouples local collision termination from global environment resets, allowing the agent to retry difficult configurations within the same episode. Experiments on multiple simulated and real-world robotic platforms show that the framework accelerates early-stage exploration and improves both success rate and navigation efficiency over conventional single-collision reset baselines, with a small collision budget producing the largest gains.
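
To make the decoupling of local collision handling from global resets concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how a collision is handled locally, so this sketch assumes the robot is simply returned to its last collision-free position. `ToyCorridorEnv`, `CollisionBudgetWrapper`, and the `budget` parameter are hypothetical names introduced for illustration only.

```python
class ToyCorridorEnv:
    """Hypothetical 1-D corridor: start at cell 0, goal at `goal`.

    Actions are forward moves of 1 or 2 cells. Landing on an obstacle
    cell counts as a collision.
    """

    def __init__(self, goal=5, obstacles=(3,)):
        self.goal = goal
        self.obstacles = set(obstacles)
        self.pos = 0

    def reset(self):
        # Global reset: the robot goes back to the start cell.
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos += action
        collided = self.pos in self.obstacles
        reached = (not collided) and self.pos >= self.goal
        reward = -1.0 if collided else (1.0 if reached else 0.0)
        return self.pos, reward, collided, reached


class CollisionBudgetWrapper:
    """Ends the episode only after `budget` collisions (assumed mechanism).

    With budget=1 this reduces to the conventional single-collision reset.
    """

    def __init__(self, env, budget):
        self.env = env
        self.budget = budget
        self.collisions = 0

    def reset(self):
        self.collisions = 0
        return self.env.reset()

    def step(self, action):
        safe_pos = self.env.pos  # last collision-free position
        obs, reward, collided, reached = self.env.step(action)
        if collided:
            self.collisions += 1
            info = {"collided": True, "collisions": self.collisions}
            if self.collisions < self.budget:
                # Local handling (assumption): keep the penalty, restore the
                # last safe position, and continue the same episode.
                self.env.pos = safe_pos
                return safe_pos, reward, False, info
            # Budget exhausted: signal done so the caller does a global reset.
            return obs, reward, True, info
        info = {"collided": False, "collisions": self.collisions}
        return obs, reward, reached, info


def run_episode(budget, actions):
    """Replay a fixed action sequence and return (final_obs, done, collisions)."""
    env = CollisionBudgetWrapper(ToyCorridorEnv(), budget)
    obs, done = env.reset(), False
    for a in actions:
        obs, _, done, _ = env.step(a)
        if done:
            break
    return obs, done, env.collisions
```

With `budget=1`, the action sequence `[1, 1, 1, 2, 1]` ends at the obstacle on the third step. With `budget=2`, the same sequence hits the obstacle once, retries from the last safe cell, and then reaches the goal within the same episode. The collision penalty is still applied on every collision, so the budget changes when the episode ends, not whether collisions are penalized.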