Demystifying Action Space Design for Robotic Manipulation Policies

arXiv cs.RO / 4/24/2026



Key Points

The paper argues that action space design strongly influences imitation-based robotic manipulation policy learning by shaping the optimization landscape and affecting learning behavior.
It presents a large-scale empirical study that analyzes action design choices along temporal and spatial dimensions to clarify how they impact policy learnability and control stability.
Using 13,000+ real-world rollouts on a bimanual robot and evaluation across 500+ trained models in four scenarios, the authors quantify trade-offs among action representations.
Results indicate that properly designing the policy to predict delta actions consistently improves performance, while joint-space and task-space parameterizations offer complementary strengths: joint space favors control stability and task space favors generalization.
The work aims to move beyond ad-hoc or legacy action space heuristics by providing a more structured “design philosophy” for robotic policy construction.
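
To make the absolute vs. delta distinction concrete, here is a minimal sketch in Python. It assumes a delta action is the difference between consecutive targets, with the first delta taken relative to the current robot state; the paper may define deltas differently (for example, relative to the state at the start of an action chunk). The function names and the two-dimensional toy values are hypothetical and do not come from the paper.

```python
def absolute_to_delta(state, targets):
    """Convert absolute targets into step-to-step deltas.

    Assumption: the first delta is relative to `state`, and each later
    delta is relative to the previous target.
    """
    deltas = []
    prev = list(state)
    for target in targets:
        deltas.append([t - p for t, p in zip(target, prev)])
        prev = list(target)
    return deltas


def delta_to_absolute(state, deltas):
    """Integrate deltas back into absolute targets, starting from `state`."""
    targets = []
    prev = list(state)
    for delta in deltas:
        prev = [p + d for p, d in zip(prev, delta)]
        targets.append(prev)
    return targets


# Toy two-dimensional example (hypothetical values).
state = [0.5, 1.0]
absolute_targets = [[0.75, 1.0], [1.0, 0.5]]

deltas = absolute_to_delta(state, absolute_targets)
recovered = delta_to_absolute(state, deltas)
```

Both forms describe the same trajectory here; what differs is the quantity the policy is asked to regress, which is the design choice the study evaluates.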

Abstract

The specification of the action space plays a pivotal role in imitation-based robotic manipulation policy learning, fundamentally shaping the optimization landscape of policy learning. While recent advances have focused heavily on scaling training data and model capacity, the choice of action space remains guided by ad-hoc heuristics or legacy designs, leading to an ambiguous understanding of robotic policy design philosophies. To address this ambiguity, we conducted a large-scale and systematic empirical study, confirming that the action space does have significant and complex impacts on robotic policy learning. We dissect the action design space along temporal and spatial axes, facilitating a structured analysis of how these choices govern both policy learnability and control stability. Based on 13,000+ real-world rollouts on a bimanual robot and evaluation on 500+ trained models over four scenarios, we examine the trade-offs between absolute vs. delta representations, and joint-space vs. task-space parameterizations. Our large-scale results suggest that properly designing the policy to predict delta actions consistently improves performance, while joint-space and task-space representations offer complementary strengths, favoring control stability and generalization, respectively.
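
As a rough illustration of the joint-space vs. task-space distinction, the sketch below uses a hypothetical planar two-link arm, not the bimanual robot from the paper. A joint-space action specifies the joint angles directly, while a task-space action specifies where the end effector should be; forward kinematics maps the former to the latter. The link lengths and the function name are assumptions made for this example.

```python
import math


def forward_kinematics(q1, q2, l1=1.0, l2=1.0):
    """Map joint angles (radians) of a planar two-link arm to the
    end-effector position (x, y).

    l1 and l2 are hypothetical link lengths chosen for illustration.
    """
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


# A joint-space action: command the joint angles directly.
joint_action = (math.pi / 2, 0.0)

# The same motion expressed in task space: the end-effector position
# that those joint angles produce.
task_action = forward_kinematics(*joint_action)
```

Executing a task-space action on hardware additionally requires inverse kinematics or a Cartesian controller to recover joint commands, whereas a joint-space action can be sent to the joint controllers directly.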