Learning Energy-Efficient Air–Ground Actuation for Hybrid Robots on Stair-Like Terrain

arXiv cs.AI / March 31, 2026


Key Points

  • The paper addresses a core limitation of hybrid aerial–ground robots on stair-like terrain, where wheels alone stall at edges and pure flight is inefficient for small height gains.
  • It proposes an energy-aware reinforcement learning framework that learns a single continuous control policy coordinating propellers, wheels, and tilt servos without switching between predefined “aerial” and “ground” modes.
  • Training uses proprioception plus a local height scan in Isaac Lab with parallel environments, and relies on hardware-calibrated thrust/power models so the reward function penalizes real electrical energy rather than proxy metrics.
  • Simulation results show about a 4× energy reduction versus propeller-only control, and the learned policy transfers to a DoubleBee prototype, achieving 38% lower average power than a rule-based decoupled controller on an 8 cm gap-climbing task.
  • Overall, the work demonstrates that energy-efficient hybrid actuation can emerge through learning and be deployed on real hardware.
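To make the "penalize real electrical energy" idea concrete, here is a minimal sketch of an energy-aware reward term. The quadratic per-rotor power fit, coefficient values, drivetrain efficiency, and function names are illustrative assumptions, not taken from the paper; the paper's own hardware-calibrated models may differ.

```python
import numpy as np

# Hypothetical coefficients from a bench calibration (illustrative only):
# electrical power of one rotor fit as a quadratic in commanded thrust [N].
ROTOR_POWER_COEFFS = (2.1, 3.4, 1.8)   # c0 + c1*T + c2*T^2, result in watts
WHEEL_DRIVE_EFFICIENCY = 0.7           # assumed motor + gearbox efficiency

def rotor_power(thrust_n: np.ndarray) -> np.ndarray:
    """Electrical power [W] drawn by each rotor for a commanded thrust [N]."""
    c0, c1, c2 = ROTOR_POWER_COEFFS
    return c0 + c1 * thrust_n + c2 * thrust_n ** 2

def wheel_power(torque_nm: np.ndarray, omega_rad_s: np.ndarray) -> np.ndarray:
    """Electrical power [W] per wheel: mechanical power over drivetrain efficiency."""
    return np.abs(torque_nm * omega_rad_s) / WHEEL_DRIVE_EFFICIENCY

def energy_aware_reward(progress_m, thrusts, torques, omegas,
                        dt=0.02, w_energy=0.05):
    """Reward forward progress, penalize the electrical energy spent this step."""
    p_elec = rotor_power(thrusts).sum() + wheel_power(torques, omegas).sum()
    return progress_m - w_energy * p_elec * dt
```

Because the penalty is denominated in joules rather than, say, squared actuator commands, the policy is free to trade a small amount of thrust for a large reduction in wheel stall losses at a stair edge, which is the blended behavior the paper reports.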

Abstract

Hybrid aerial–ground robots offer both traversability and endurance, but stair-like discontinuities create a trade-off: wheels alone often stall at edges, while flight is energy-hungry for small height gains. We propose an energy-aware reinforcement learning framework that trains a single continuous policy to coordinate propellers, wheels, and tilt servos without predefined aerial and ground modes. We train policies from proprioception and a local height scan in Isaac Lab with parallel environments, using hardware-calibrated thrust/power models so the reward penalizes true electrical energy. The learned policy discovers thrust-assisted driving that blends aerial thrust and ground traction. In simulation it achieves about 4 times lower energy consumption than propeller-only control. We transfer the policy to a DoubleBee prototype on an 8 cm gap-climbing task, where it achieves 38% lower average power than a rule-based decoupled controller. These results show that efficient hybrid actuation can emerge from learning and be deployed on real hardware.