Tune to Learn: How Controller Gains Shape Robot Policy Learning
arXiv cs.RO / 4/6/2026
Key Points
- The paper argues that when state-conditioned robot policies are paired with position controllers, the controller gains should be chosen for how learnable they make the resulting closed-loop system, not only for a target compliance or stiffness.
- It systematically studies how position controller gains affect behavior cloning, reinforcement learning from scratch, and sim-to-real transfer across multiple tasks and robot embodiments.
- The results show that behavior cloning performs best under compliant, overdamped gain regimes, while reinforcement learning can succeed across gain regimes provided its hyperparameters are tuned accordingly.
- For sim-to-real transfer, both stiff and overdamped gain regimes can reduce transfer performance, indicating a tradeoff between learnability and real-world robustness.
- Overall, the optimal gain-setting strategy depends on the learning paradigm used, not solely on the desired low-level control characteristics.
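The gain regimes the summary refers to can be made concrete with standard second-order analysis: for an idealized PD position controller driving a point mass, the damping ratio determines whether the closed loop is underdamped (stiff and oscillatory), critically damped, or overdamped (compliant and sluggish). The sketch below is illustrative only and is not taken from the paper; the function name, the unit-mass default, and the regime thresholds are assumptions for the example.

```python
import math

def classify_pd_regime(kp: float, kd: float, mass: float = 1.0):
    """Classify the closed-loop regime of an idealized PD position controller.

    Assumes the textbook second-order model
        m * x'' + kd * x' + kp * x = kp * x_ref,
    whose damping ratio is zeta = kd / (2 * sqrt(kp * m)).
    This is a didactic sketch, not the paper's experimental setup.
    """
    zeta = kd / (2.0 * math.sqrt(kp * mass))
    if zeta < 1.0:
        regime = "underdamped"      # oscillatory; high kp relative to kd reads as "stiff"
    elif zeta == 1.0:
        regime = "critically damped"
    else:
        regime = "overdamped"       # no overshoot; the compliant regime BC favored here
    return regime, zeta

# Example: same proportional gain, different damping
print(classify_pd_regime(100.0, 10.0))  # ("underdamped", 0.5)
print(classify_pd_regime(100.0, 40.0))  # ("overdamped", 2.0)
```

Under this model, holding `kp` fixed and raising `kd` moves the system from the stiff, oscillatory regime toward the overdamped regime that the summary says favors behavior cloning.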
