MRS: Multi-Resolution Skills for HRL Agents

arXiv cs.RO / 4/22/2026

Key Points

  • The paper identifies a key reason hierarchical reinforcement learning (HRL) can underperform on tasks requiring agility: in subgoal-based HRL, the manager's goal representation is typically learned without constraints on reachability or temporal distance from the current state, which prevents precise local subgoal selection.
  • It shows that the best subgoal distance depends on both the task and the state: nearby subgoals enable precise control but amplify prediction noise, while distant subgoals produce smoother motion at the cost of geometric precision.
  • The authors introduce Multi-Resolution Skills (MRS), which learns multiple goal-prediction modules, each specialized to a fixed temporal horizon, together with a jointly trained meta-controller that selects among them based on the current state.
  • On DeepMind Control Suite, Gym-Robotics, and long-horizon AntMaze tasks, MRS consistently outperforms fixed-resolution HRL baselines and significantly narrows the performance gap between HRL and non-HRL state-of-the-art methods.
  • The work suggests that explicitly modeling temporal horizons in goal prediction can improve HRL’s ability to handle long-horizon planning while maintaining local agility.

Abstract

Hierarchical reinforcement learning (HRL) decomposes the policy into a manager and a worker, enabling long-horizon planning but introducing a performance gap on tasks requiring agility. We identify a root cause: in subgoal-based HRL, the manager's goal representation is typically learned without constraints on reachability or temporal distance from the current state, preventing precise local subgoal selection. We further show that the optimal subgoal distance is both task- and state-dependent: nearby subgoals enable precise control but amplify prediction noise, while distant subgoals produce smoother motion at the cost of geometric precision. We propose Multi-Resolution Skills (MRS), which learns multiple goal-prediction modules each specialized to a fixed temporal horizon, with a jointly trained meta-controller that selects among them based on the current state. MRS consistently outperforms fixed-resolution baselines and significantly reduces the performance gap between HRL and non-HRL state-of-the-art on DeepMind Control Suite, Gym-Robotics, and long-horizon AntMaze tasks. [Project page: https://sites.google.com/view/multi-res-skills/home]
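
To make the structure described in the abstract concrete, here is a minimal Python sketch of the idea: one goal predictor per fixed temporal horizon, plus a meta-controller that picks a predictor from the current state. It is not the authors' implementation. The horizon values, the linear "state plus learned displacement" predictor, the hindsight-style relabeling of targets, the toy trajectory, and the hand-set meta-controller weights are all assumptions made for illustration; in the paper the meta-controller is trained jointly, which is omitted here.

```python
import math

HORIZONS = (2, 8, 32)  # hypothetical temporal horizons, in environment steps


def relabel(trajectory, horizon):
    """Pair each state with the state reached `horizon` steps later (assumed target)."""
    return [(trajectory[t], trajectory[t + horizon])
            for t in range(len(trajectory) - horizon)]


class GoalPredictor:
    """Toy stand-in for one goal-prediction module tied to a fixed horizon."""

    def __init__(self, horizon, dim, lr=0.05):
        self.horizon = horizon
        self.lr = lr
        self.offset = [0.0] * dim  # learned displacement from state to subgoal

    def predict(self, state):
        return [s + o for s, o in zip(state, self.offset)]

    def update(self, state, target):
        # One gradient step on the squared error to the relabeled target.
        pred = self.predict(state)
        for i in range(len(self.offset)):
            self.offset[i] -= self.lr * 2.0 * (pred[i] - target[i])


class MetaController:
    """Scores each horizon from the current state and picks the best one."""

    def __init__(self, weights, biases):
        self.weights = weights  # one weight vector per horizon
        self.biases = biases

    def probabilities(self, state):
        logits = [sum(w * s for w, s in zip(ws, state)) + b
                  for ws, b in zip(self.weights, self.biases)]
        top = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def select(self, state):
        probs = self.probabilities(state)
        return max(range(len(probs)), key=probs.__getitem__)


# Toy trajectory: constant velocity along the first axis.
trajectory = [[0.1 * t, 0.0] for t in range(64)]

# Train each predictor only on targets at its own horizon.
predictors = [GoalPredictor(horizon=h, dim=2) for h in HORIZONS]
for predictor in predictors:
    pairs = relabel(trajectory, predictor.horizon)
    for _ in range(20):
        for state, target in pairs:
            predictor.update(state, target)

# Placeholder weights chosen by hand; a real meta-controller would be learned.
meta = MetaController(weights=[[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]],
                      biases=[0.5, 1.0, -2.0])


def propose_subgoal(state):
    """Return the selected horizon and the subgoal predicted by its module."""
    index = meta.select(state)
    return HORIZONS[index], predictors[index].predict(state)
```

Calling `propose_subgoal(state)` returns both the chosen horizon and the subgoal from the matching module, so a worker policy could be conditioned on either. Because each predictor sees only targets at its own horizon, every subgoal it emits has a known temporal distance from the current state, which is the constraint the abstract says fixed-resolution managers lack.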