Learning Multi-Agent Local Collision-Avoidance for Collaborative Carrying tasks with Coupled Quadrupedal Robots

arXiv cs.RO / 3/25/2026

Key Points

The paper targets collaborative carrying with multiple mechanically coupled quadrupedal robots, focusing on safe coordination in environments that include obstacles rather than only obstacle-free spaces.
It introduces an RL-based hierarchical control system that tracks a commanded velocity direction while avoiding collisions using only onboard sensing, removing the need for precomputed trajectories or full map knowledge.
A high-level object-centric policy selects actions that command two pretrained locomotion policies, enabling coordinated motion without centralized trajectory planning.
The method uses a game-inspired curriculum to progressively increase terrain and obstacle complexity during training.
Experiments on two coupled quadrupeds in unknown environments show improved performance against optimization-based and decentralized RL baselines, and demonstrate map- and path-planner-free locomotion.
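The hierarchical structure described in the key points can be illustrated with a minimal sketch. It is not the authors' implementation. The summary only says that a high-level object-centric policy commands two pretrained locomotion policies, so the planar-twist action format, the rigid-body mapping from object velocity to per-robot velocity, and every name below (`ObjectTwist`, `robot_velocity_commands`, `hierarchical_step`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vec2 = Tuple[float, float]


@dataclass
class ObjectTwist:
    """Planar velocity of the carried object (hypothetical action format)."""
    vx: float  # forward velocity in the object frame, m/s
    vy: float  # lateral velocity in the object frame, m/s
    wz: float  # yaw rate, rad/s


def robot_velocity_commands(twist: ObjectTwist,
                            attachment_points: Sequence[Vec2]) -> List[Vec2]:
    """Velocity of each attachment point under planar rigid-body motion.

    Uses v_i = v + wz x r_i, where r_i is the attachment point of robot i
    expressed in the object frame. This mapping is an assumption; the
    source does not state how high-level actions reach the robots.
    """
    commands = []
    for rx, ry in attachment_points:
        commands.append((twist.vx - twist.wz * ry, twist.vy + twist.wz * rx))
    return commands


def hierarchical_step(high_level_policy: Callable[[dict], ObjectTwist],
                      low_level_policies: Sequence[Callable[[Vec2], object]],
                      observation: dict,
                      attachment_points: Sequence[Vec2]) -> list:
    """One control step: the object-level action is split into one velocity
    command per robot, and each command goes to that robot's own
    (pretrained, here stubbed) locomotion policy."""
    twist = high_level_policy(observation)
    commands = robot_velocity_commands(twist, attachment_points)
    return [policy(cmd) for policy, cmd in zip(low_level_policies, commands)]
```

In this sketch the low-level policies never see the obstacle observations. Only the high-level policy does, which mirrors the division of labor described above, where collision avoidance is decided at the object level and locomotion is delegated.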

Abstract

Robotic collaborative carrying could greatly benefit human activities like warehouse and construction site management. However, coordinating the simultaneous motion of multiple robots represents a significant challenge. Existing works primarily focus on obstacle-free environments, making them unsuitable for most real-world applications. Works that account for obstacles either overfit to a specific terrain configuration or rely on pre-recorded maps combined with path planners to compute collision-free trajectories. This work focuses on two quadrupedal robots mechanically connected to a carried object. We propose a Reinforcement Learning (RL)-based policy that enables tracking a commanded velocity direction while avoiding collisions with nearby obstacles using only onboard sensing, eliminating the need for precomputed trajectories and complete map knowledge. Our work presents a hierarchical architecture, where a perceptive high-level object-centric policy commands two pretrained locomotion policies. Additionally, we employ a game-inspired curriculum to progressively increase the complexity of obstacles in the terrain. We validate our approach on two quadrupedal robots connected to a bar via spherical joints, benchmarking it against optimization-based and decentralized RL baselines. Our hardware experiments demonstrate the ability of our system to locomote in unknown environments without the need for a map or a path planner. The video of our work is available in the multimedia material.
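The abstract does not specify how the game-inspired curriculum is scheduled. One common reading of "game-inspired" is a level system in which an environment moves up a level after a successful episode and down a level after a failed one. The sketch below assumes that rule. The function names, the success criterion, and the obstacle counts are hypothetical and are not taken from the paper.

```python
def update_level(level: int, succeeded: bool, max_level: int) -> int:
    """Promote after a success, demote after a failure, clamped to [0, max_level]."""
    if succeeded:
        return min(level + 1, max_level)
    return max(level - 1, 0)


def obstacle_count(level: int, base: int = 2, per_level: int = 3) -> int:
    """Hypothetical difficulty mapping: more obstacles at higher levels."""
    return base + per_level * level


def run_curriculum(outcomes, start_level: int = 0, max_level: int = 5):
    """Replay a sequence of episode outcomes and return the level after each one."""
    level = start_level
    history = []
    for succeeded in outcomes:
        level = update_level(level, succeeded, max_level)
        history.append(level)
    return history
```

For example, `run_curriculum([True, True, False, True])` yields the levels `[1, 2, 1, 2]`, so a single failure costs one level rather than resetting progress.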