Phase-Aware Policy Learning for Skateboard Riding of Quadruped Robots via Feature-wise Linear Modulation

arXiv cs.RO / 4/22/2026


Key Points

  • The paper presents Phase-Aware Policy Learning (PAPL), a reinforcement-learning framework designed to control quadruped robots for skateboard riding despite phase-dependent dynamics and perception-driven interactions.
  • PAPL improves actor and critic networks by adding phase-conditioned Feature-wise Linear Modulation (FiLM) layers, using the cyclic nature of skateboarding to learn a single unified policy with phase-dependent behaviors.
  • The method shares knowledge across different skateboarding phases while remaining tailored to robot-specific characteristics, aiming to reduce policy fragmentation and improve generalization.
  • In simulation, the learned policy tracks velocity commands accurately, and ablation studies measure the contribution of each component.
  • The authors also report comparisons with leg and wheel-leg baselines and demonstrate real-world transferability, indicating practical robustness beyond simulation.

Abstract

Skateboards offer a compact and efficient means of transportation as a type of personal mobility device. However, controlling them with legged robots poses several challenges for policy learning due to perception-driven interactions and multi-modal control objectives across distinct skateboarding phases. To address these challenges, we introduce Phase-Aware Policy Learning (PAPL), a reinforcement-learning framework tailored for skateboarding with quadruped robots. PAPL leverages the cyclic nature of skateboarding by integrating phase-conditioned Feature-wise Linear Modulation layers into actor and critic networks, enabling a unified policy that captures phase-dependent behaviors while sharing robot-specific knowledge across phases. Our evaluations in simulation validate command-tracking accuracy, and ablation studies quantify each component's contribution. We also compare locomotion efficiency against leg and wheel-leg baselines and show real-world transferability.
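
To make the phase-conditioned modulation concrete, here is a minimal sketch of a FiLM layer in Python with NumPy. It applies the standard FiLM form, a per-feature scale and shift computed from a conditioning input, to a hidden feature vector. Everything specific is an assumption made for illustration and is not taken from the paper: the sin/cos encoding of the phase, the single linear map that generates the scale and shift, the `1 + ...` offset on the scale, the layer sizes, and the names `phase_features` and `FiLMLayer`.

```python
import numpy as np


def phase_features(phase):
    """Encode a cyclic phase in [0, 1] as (sin, cos).

    Assumption for this sketch: phase 0 and phase 1 denote the same point
    in the cycle, so they should map to the same conditioning vector.
    """
    angle = 2.0 * np.pi * phase
    return np.array([np.sin(angle), np.cos(angle)])


class FiLMLayer:
    """Feature-wise linear modulation: out = gamma(c) * h + beta(c)."""

    def __init__(self, cond_dim, feat_dim, rng):
        # Hypothetical generator: one linear map producing both the scale
        # offsets and the shifts from the conditioning vector.
        self.W = rng.normal(scale=0.1, size=(cond_dim, 2 * feat_dim))
        self.b = np.zeros(2 * feat_dim)
        self.feat_dim = feat_dim

    def __call__(self, h, cond):
        params = cond @ self.W + self.b
        delta_gamma = params[: self.feat_dim]
        beta = params[self.feat_dim:]
        # Illustrative choice: scale around 1 so the layer stays close to
        # the identity when the generator output is small.
        gamma = 1.0 + delta_gamma
        return gamma * h + beta


rng = np.random.default_rng(0)
film = FiLMLayer(cond_dim=2, feat_dim=4, rng=rng)

h = np.ones(4)  # stand-in for hidden features shared across phases
out_start = film(h, phase_features(0.0))
out_half = film(h, phase_features(0.5))
out_wrap = film(h, phase_features(1.0))
```

In this sketch the same hidden features `h` are modulated differently at different phases, while the weights that would produce `h` are shared; that is the general idea behind using one policy for phase-dependent behavior. In an actor-critic setup, a layer like this would sit between the hidden layers of both networks, though where the authors place it is not stated in this summary.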