Rodrigues Network for Learning Robot Actions

arXiv cs.RO / 4/23/2026

Opinion · Ideas & Deep Analysis · Models & Research

Key Points

The paper argues that robot learning models for articulated actions should incorporate inductive biases reflecting the systems’ underlying kinematics rather than relying solely on generic architectures like MLPs or Transformers.
It introduces the Neural Rodrigues Operator as a learnable extension of classical forward kinematics, intended to inject kinematics-aware structure into neural computation.
Building on this operator, the authors propose the Rodrigues Network (RodriNet), a new action-focused neural architecture.
Experiments show that RodriNet improves performance on synthetic kinematics and motion prediction tasks and also works effectively in realistic settings, including diffusion-policy imitation learning and single-image 3D hand reconstruction.
Overall, the results indicate that structured kinematic priors in the network architecture can enhance learning of robotic actions across multiple domains.

Abstract

Understanding and predicting articulated actions is important in robot learning. However, common architectures such as MLPs and Transformers lack inductive biases that reflect the underlying kinematic structure of articulated systems. To address this gap, we propose the Neural Rodrigues Operator, a learnable generalization of the classical forward kinematics operation, designed to inject kinematics-aware inductive bias into neural computation. Building on this operator, we design the Rodrigues Network (RodriNet), a novel neural architecture specialized for processing actions. We evaluate the expressivity of our network on two synthetic tasks, on kinematics and motion prediction, showing significant improvements compared to standard backbones. We further demonstrate its effectiveness in two realistic applications: (i) imitation learning on robotic benchmarks with Diffusion Policy, and (ii) single-image 3D hand reconstruction. Our results suggest that integrating structured kinematic priors into the network architecture improves action learning in various domains.
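
For readers unfamiliar with the classical operation that the Neural Rodrigues Operator is said to generalize, the following is a minimal NumPy sketch of Rodrigues' rotation formula and of forward kinematics along a serial chain of revolute joints. It illustrates only the classical, non-learnable computation; it is not the paper's operator or architecture. The function names (`rodrigues`, `forward_kinematics`) and the planar two-link arm are assumptions chosen for illustration.

```python
import numpy as np


def rodrigues(axis, angle):
    """Rotation matrix for `angle` radians about `axis` (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)  # the formula assumes a unit axis
    # Skew-symmetric cross-product matrix of k.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def forward_kinematics(axes, offsets, angles):
    """World poses (R, p) of each frame in a serial chain of revolute joints.

    axes[i]    : rotation axis of joint i, expressed in its parent frame
    offsets[i] : translation from the parent frame origin to joint i
    angles[i]  : joint angle in radians
    """
    R = np.eye(3)
    p = np.zeros(3)
    poses = []
    for axis, offset, angle in zip(axes, offsets, angles):
        p = p + R @ np.asarray(offset, dtype=float)  # move to the joint origin
        R = R @ rodrigues(axis, angle)               # then rotate about the joint
        poses.append((R, p))
    return poses


# Hypothetical planar arm: two unit-length links rotating about z.
# The third entry is a fixed tip frame (angle 0) at the end of link 2.
axes = [(0, 0, 1), (0, 0, 1), (0, 0, 1)]
offsets = [(0, 0, 0), (1, 0, 0), (1, 0, 0)]
angles = [np.pi / 2, -np.pi / 2, 0.0]

poses = forward_kinematics(axes, offsets, angles)
tip_rotation, tip_position = poses[-1]
print(np.round(tip_position, 6))
```

With the first joint at 90 degrees and the second at -90 degrees, the first link points along +y and the second along +x, so the tip lands at approximately (1, 1, 0). Each joint pose depends on its parent's pose through a fixed rotate-and-translate rule; this is the kind of kinematic structure the paper argues generic MLP and Transformer backbones lack.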