Bridging Discrete Planning and Continuous Execution for Redundant Robot

arXiv cs.RO / 4/3/2026

Key Points

  • The paper addresses a common failure mode of voxel-grid reinforcement learning planners on 7-DoF redundant robot arms, where point-wise numerical inverse kinematics execution causes step-size jitter, abrupt joint transitions, and instability near singularities.
  • It introduces a “bridging” framework that leaves the discrete planner unchanged while stabilizing both the discrete action representation (step-normalized 26-neighbor Cartesian actions plus geometric tie-breaking) and the continuous execution layer.
  • For execution, it proposes a task-priority damped least-squares inverse kinematics (TP-DLS) approach that treats end-effector position as the primary task and projects posture and joint-centering objectives into the null space.
  • With trust-region clipping and joint velocity constraints added to the execution layer, the bridge improves results on a 7-DoF manipulator: dense-scene planning success rises from about 0.58 to 1.00, representative path length shortens from roughly 1.53 m to 1.10 m, end-effector error stays below 1 mm, and peak joint accelerations drop by more than an order of magnitude.
  • Overall, the work shows that coupling discrete planning outputs with constrained, priority-based continuous IK can significantly improve the smoothness and stability with which voxel-based RL paths are executed on redundant manipulators.
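
The step-normalized 26-neighbor action set from the second key point can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the 0.05 m step, and the tie-breaking rule (prefer the tied action best aligned with the previous move, so the path keeps going straight) are all assumptions, since the summary does not specify the geometric criterion.

```python
import itertools
import math


def step_normalized_actions(step=0.05):
    """Return the 26 neighbor directions, each rescaled to the same length.

    Without normalization, face, edge, and corner moves would have lengths
    1, sqrt(2), and sqrt(3) grid units. `step` (meters) is a hypothetical value.
    """
    actions = []
    for d in itertools.product((-1, 0, 1), repeat=3):
        if d == (0, 0, 0):
            continue  # skip the "stay in place" offset
        norm = math.sqrt(sum(c * c for c in d))
        actions.append(tuple(step * c / norm for c in d))
    return actions


def pick_action(actions, scores, prev_dir, eps=1e-9):
    """Pick the best-scoring action; break ties geometrically.

    Assumed rule: among actions whose score is within `eps` of the best,
    choose the one with the largest dot product with the previous move.
    """
    best = max(scores)
    tied = [i for i, s in enumerate(scores) if best - s <= eps]
    return max(tied, key=lambda i: sum(a * p for a, p in zip(actions[i], prev_dir)))
```

Because every action has the same Euclidean length, the downstream IK layer receives evenly spaced Cartesian waypoints regardless of which neighbor the planner selects.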

Abstract

Voxel-grid reinforcement learning is widely adopted for path planning for redundant manipulators due to its simplicity and reproducibility. However, direct execution through point-wise numerical inverse kinematics on 7-DoF arms often yields step-size jitter, abrupt joint transitions, and instability near singular configurations. This work proposes a bridging framework between discrete planning and continuous execution without modifying the discrete planner itself. On the planning side, step-normalized 26-neighbor Cartesian actions and a geometric tie-breaking mechanism are introduced to suppress unnecessary turns and eliminate step-size oscillations. On the execution side, a task-priority damped least-squares (TP-DLS) inverse kinematics layer is implemented. This layer treats end-effector position as the primary task, while posture and joint centering are handled as subordinate tasks projected into the null space, combined with trust-region clipping and joint velocity constraints. On a 7-DoF manipulator in random sparse, medium, and dense environments, this bridge raises planning success in dense scenes from about 0.58 to 1.00 and shortens representative path length from roughly 1.53 m to 1.10 m. While keeping end-effector error below 1 mm, it also reduces peak joint accelerations by more than an order of magnitude, substantially improving the continuous execution quality of voxel-based RL paths on redundant manipulators.
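
To make the execution layer described above concrete, here is a minimal sketch of one task-priority damped least-squares update using NumPy. It is an illustration under stated assumptions rather than the paper's code: the function name, gains, damping value, and limits are hypothetical, the two subordinate tasks are simply summed before projection, and "trust-region clipping" is interpreted as rescaling the whole joint step when its norm exceeds a radius.

```python
import numpy as np


def tp_dls_step(J, pos_err, q, q_ref, q_mid, damping=0.05,
                k_posture=0.1, k_center=0.05, trust_radius=0.1, dq_max=0.05):
    """One TP-DLS joint update for a redundant arm (all defaults are assumed).

    J       : (3, n) position Jacobian at the current configuration
    pos_err : (3,) end-effector position error (target minus current)
    q       : (n,) current joint angles
    q_ref   : (n,) preferred posture; q_mid : (n,) joint-range midpoints
    dq_max  : per-joint step limit, i.e. velocity limit times control period
    """
    n = J.shape[1]

    # Primary task: damped least squares, dq = J^T (J J^T + lambda^2 I)^-1 e.
    # The damping keeps the step bounded near singular configurations.
    JJt = J @ J.T + (damping ** 2) * np.eye(J.shape[0])
    dq_primary = J.T @ np.linalg.solve(JJt, pos_err)

    # Subordinate tasks: pull toward a reference posture and joint mid-range.
    dq_secondary = k_posture * (q_ref - q) + k_center * (q_mid - q)

    # Null-space projector N = I - J^+ J, so J @ N = 0 and the secondary
    # motion does not disturb the end-effector position (to first order).
    N = np.eye(n) - np.linalg.pinv(J) @ J
    dq = dq_primary + N @ dq_secondary

    # Trust region (assumed form): shrink the step if it is too long.
    norm = np.linalg.norm(dq)
    if norm > trust_radius:
        dq = dq * (trust_radius / norm)

    # Joint velocity constraint: clamp each joint's step independently.
    return np.clip(dq, -dq_max, dq_max)
```

In a control loop, this function would be called once per waypoint (or several times until the position error is small), with `J` and `pos_err` recomputed from forward kinematics after each update. Note that per-joint clamping can slightly change the step direction, which is one reason a real implementation might iterate rather than take a single step.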