Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input

arXiv cs.RO / 4/22/2026


Key Points

  • The paper studies whether sparsely gated mixture-of-experts (MoE) architectures can improve vision-based robotic parkour compared with standard MLP control policies.
  • In experiments with a real Unitree Go2 quadruped, the MoE-based policy significantly outperformed an MLP baseline, achieving about double the successful trials when traversing large obstacles.
  • With the active-parameter budget held comparable, the MoE delivered better results; matching its performance with a standard MLP required scaling the MLP up to the full MoE parameter count.
  • That MLP scaling led to a 14.3% increase in computation time, indicating a favorable performance–efficiency trade-off for sparsely gated MoE in this setting.
  • The work includes an anonymized codebase link to support replication and further experimentation.
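The core idea behind the efficiency claim is that a sparsely gated MoE routes each input through only the top-K of its E experts, so the compute per inference step tracks the active-parameter count rather than the total. The sketch below is a minimal, hypothetical illustration of top-K gating with toy dimensions; it is not the paper's architecture, and all sizes and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed for illustration, not from the paper):
D, H, E, K = 8, 16, 4, 1  # input dim, hidden dim, num experts, experts active per input

# One small two-layer MLP per expert.
experts = [
    {"W1": rng.normal(size=(D, H)) * 0.1, "W2": rng.normal(size=(H, D)) * 0.1}
    for _ in range(E)
]
W_gate = rng.normal(size=(D, E)) * 0.1  # gating network: one logit per expert


def moe_forward(x):
    """Sparsely gated MoE forward pass: run only the top-K experts for x."""
    logits = x @ W_gate
    topk = np.argsort(logits)[-K:]                   # indices of the K highest-scoring experts
    w = np.exp(logits[topk] - logits[topk].max())
    w /= w.sum()                                     # softmax over the selected experts only
    out = np.zeros(D)
    for weight, i in zip(w, topk):
        h = np.maximum(experts[i]["W1"].T @ x, 0.0)  # ReLU hidden layer
        out += weight * (experts[i]["W2"].T @ h)
    return out, topk


x = rng.normal(size=D)
y, used = moe_forward(x)

# Only K of E experts run per input, so active parameters stay fixed
# while the total parameter count grows with E.
per_expert = D * H + H * D
active_params = K * per_expert + D * E
total_params = E * per_expert + D * E
print(active_params, total_params)
```

This makes the paper's comparison concrete: a dense MLP matched to `active_params` is the "same active budget" baseline, while matching `total_params` requires a larger, slower MLP, which is where the reported 14.3% computation-time increase comes in.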

Abstract

Robotic parkour provides a compelling benchmark for advancing locomotion over highly challenging terrain, including large discontinuities such as elevated steps. Recent approaches have demonstrated impressive capabilities, including dynamic climbing and jumping, but typically rely on sequential multilayer perceptron (MLP) architectures with densely activated layers. In contrast, sparsely gated mixture-of-experts (MoE) architectures have emerged in the large language model domain as an effective paradigm for improving scalability and performance by activating only a subset of parameters at inference time. In this work, we investigate the application of sparsely gated MoE architectures to vision-based robotic parkour. We compare control policies based on standard MLPs and MoE architectures under a controlled setting where the number of active parameters at inference time is matched. Experimental results on a real Unitree Go2 quadruped robot demonstrate clear performance gains, with the MoE policy achieving double the number of successful trials in traversing large obstacles compared to a standard MLP baseline. We further show that achieving comparable performance with a standard MLP requires scaling its parameter count to match that of the total MoE model, resulting in a 14.3% increase in computation time. These results highlight that sparsely gated MoE architectures provide a favorable trade-off between performance and computational efficiency, enabling improved scaling of control policies for vision-based robotic parkour. An anonymized link to the codebase is https://osf.io/v2kqj/files/github?view_only=7977dee10c0a44769184498eaba72e44.