Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion

arXiv cs.RO / 3/27/2026


Key Points

  • The paper proposes a unified sim-to-real reliability approach for MoE-based robust quadrupedal locomotion that targets failures caused by sim-to-real gaps and reward overfitting in complex terrains.
  • It introduces an MoE locomotion policy with a gated set of specialist experts that decomposes latent terrain and command modeling, enabling robust generalization using proprioception-only sensing.
  • The framework includes RoboGauge, a predictive assessment suite that quantifies sim-to-real transferability using multi-dimensional proprioception-based metrics derived from sim-to-sim tests.
  • Experiments on a Unitree Go2 show successful deployment on previously unseen challenging terrains such as snow, sand, stairs, slopes, and 30 cm obstacles.
  • Dedicated high-speed tests report speeds of up to 4 m/s and an emergent narrow-width gait associated with improved stability at high velocity.
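
To make the "gated set of specialist experts" idea concrete, here is a minimal sketch of a generic soft-gated mixture of experts acting on a proprioceptive observation vector. It is not the paper's architecture: the class name MoEPolicy, the linear experts, the 48-dimensional observation, and the four experts are all assumptions chosen for illustration. The 12-dimensional action matches the Go2's 12 actuated joints.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Compute W @ x + b for a row-major weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class MoEPolicy:
    """Hypothetical soft-gated MoE: each expert is a single linear layer."""

    def __init__(self, obs_dim, act_dim, num_experts, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # Gate maps the observation to one logit per expert.
        self.gate_w = mat(num_experts, obs_dim)
        self.gate_b = [0.0] * num_experts
        # Each expert maps the observation to a full action vector.
        self.experts = [(mat(act_dim, obs_dim), [0.0] * act_dim)
                        for _ in range(num_experts)]

    def act(self, obs):
        gate = softmax(linear(self.gate_w, self.gate_b, obs))
        outputs = [linear(w, b, obs) for w, b in self.experts]
        # Action is the gate-weighted sum of the expert outputs.
        action = [sum(g * out[j] for g, out in zip(gate, outputs))
                  for j in range(len(outputs[0]))]
        return action, gate


# Assumed sizes: 48-dim proprioceptive observation, 12 joint targets.
policy = MoEPolicy(obs_dim=48, act_dim=12, num_experts=4)
obs = [0.0] * 48
action, gate = policy.act(obs)
```

In a trained policy the gate weights would shift toward different experts as the observation changes, which is one way specialization across terrains or commands could emerge. With the all-zero observation used here, the gate is uniform by construction.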

Abstract

Reinforcement learning has shown strong promise for agile quadrupedal locomotion, even with proprioception-only sensing. In practice, however, the sim-to-real gap and reward overfitting in complex terrains can produce policies that fail to transfer, while physical validation remains risky and inefficient. To address these challenges, we introduce a unified framework that combines a Mixture-of-Experts (MoE) locomotion policy for robust multi-terrain representation with RoboGauge, a predictive assessment suite that quantifies sim-to-real transferability. The MoE policy employs a gated set of specialist experts to decompose latent terrain and command modeling, achieving superior deployment robustness and generalization via proprioception alone. RoboGauge further provides multi-dimensional proprioception-based metrics via sim-to-sim tests over terrains, difficulty levels, and domain randomizations, enabling reliable MoE policy selection without extensive physical trials. Experiments on a Unitree Go2 demonstrate robust locomotion on unseen challenging terrains, including snow, sand, stairs, slopes, and 30 cm obstacles. In dedicated high-speed tests, the robot reaches 4 m/s and exhibits an emergent narrow-width gait associated with improved stability at high velocity.
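
The policy-selection step described in the abstract can be illustrated with a toy aggregation over sim-to-sim episodes. This is only a sketch under stated assumptions: the metric names (success, vel_err), the worst-terrain scoring rule, the err_weight parameter, and every number below are invented for illustration and are not RoboGauge's metrics or results from the paper.

```python
from statistics import mean

# Made-up episode records: one entry per (terrain, difficulty level) test.
# "success" is 1.0 if the episode finished, "vel_err" is a tracking error.
results = {
    "policy_a": [
        {"terrain": "stairs", "level": 1, "success": 1.0, "vel_err": 0.10},
        {"terrain": "stairs", "level": 2, "success": 1.0, "vel_err": 0.20},
        {"terrain": "slope", "level": 1, "success": 1.0, "vel_err": 0.10},
    ],
    "policy_b": [
        {"terrain": "stairs", "level": 1, "success": 1.0, "vel_err": 0.20},
        {"terrain": "stairs", "level": 2, "success": 0.0, "vel_err": 0.40},
        {"terrain": "slope", "level": 1, "success": 1.0, "vel_err": 0.05},
    ],
}


def score(episodes, err_weight=0.5):
    """Hypothetical score: mean per-terrain quality, then the worst terrain."""
    by_terrain = {}
    for ep in episodes:
        quality = ep["success"] - err_weight * ep["vel_err"]
        by_terrain.setdefault(ep["terrain"], []).append(quality)
    # Taking the minimum penalizes policies that fail on any one terrain.
    return min(mean(values) for values in by_terrain.values())


# Select the checkpoint with the best worst-case terrain score.
best = max(results, key=lambda name: score(results[name]))
```

Scoring by the worst terrain rather than the overall mean is one possible design choice: it favors checkpoints that degrade gracefully everywhere over those that excel on easy terrain but fail on a hard one.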