ShapeGrasp: Simultaneous Visuo-Haptic Shape Completion and Grasping for Improved Robot Manipulation

arXiv cs.RO / 5/5/2026


Key Points

  • ShapeGrasp is a robotics approach that mimics human grasping by iteratively coupling visuo-haptic shape completion with physics-based grasp planning.
  • Starting from a single RGB-D view, the method reconstructs a full 3D object shape, simulates candidate grasps, and selects the best feasible one for execution.
  • After each grasp attempt, the system fuses new geometric constraints from tactile contacts and the gripper’s occupied space to refine the object’s shape representation.
  • If a grasp fails, ShapeGrasp re-estimates the pose and retries grasping using the updated (refined) shape, enabling closed-loop correction.
  • Real-world experiments on two robot–gripper setups show improved grasp success rates (84% for a three-finger gripper and 91% for a two-finger gripper) and better 3D reconstruction quality versus baselines.
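The closed loop described in the key points can be sketched as follows. This is an illustrative toy, not the paper's implementation: every function below is a hypothetical stand-in (the real system uses implicit-surface shape completion and rigid-body grasp simulation), and the success criterion is invented purely to show the control flow.

```python
"""Illustrative sketch of ShapeGrasp's grasp-and-complete loop.

All names and data structures are simplified stand-ins, not the
authors' API. Points and constraints are plain integers here; in the
real system they would be 3D geometry.
"""

def complete_shape(visual_points, haptic_constraints):
    # Stand-in for visuo-haptic shape completion: fuse the initial
    # RGB-D observation with constraints gathered from grasp attempts.
    return sorted(set(visual_points) | set(haptic_constraints))

def plan_grasp(shape):
    # Stand-in for physics-based grasp planning: simulate candidates
    # and pick one (here, trivially, a point index on the shape).
    return len(shape) // 2

def attempt_grasp(grasp, shape, attempt):
    # Stand-in for execution. Toy rule: the grasp succeeds once enough
    # geometry is known. On failure, report new geometric constraints
    # modeling tactile contacts and the gripper-occupied space.
    success = len(shape) >= 5
    new_constraints = {100 + 2 * attempt, 101 + 2 * attempt}
    return success, (set() if success else new_constraints)

def grasp_and_complete(visual_points, max_attempts=5):
    constraints = set()
    shape = list(visual_points)
    for attempt in range(max_attempts):
        shape = complete_shape(visual_points, constraints)  # refine shape
        grasp = plan_grasp(shape)                           # plan in simulation
        ok, new = attempt_grasp(grasp, shape, attempt)      # execute
        if ok:
            return attempt + 1, shape                       # closed-loop success
        constraints |= new                                  # fuse haptic feedback
    return None, shape
```

Starting from three "visual" points, the toy loop fails once, fuses the resulting constraints into the shape, and succeeds on the second attempt, mirroring the re-estimate-and-regrasp behavior described above.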

Abstract

Humans grasp unfamiliar objects by combining an initial visual estimate with tactile and proprioceptive feedback during interaction. We present ShapeGrasp, a robotic implementation of this strategy. The proposed method is an iterative grasp-and-complete pipeline that couples implicit-surface visuo-haptic shape completion (reconstructing a full 3D shape from partial observations) with physics-based grasp planning. From a single RGB-D view, ShapeGrasp infers a complete shape (point cloud or triangular mesh), generates candidate grasps via rigid-body simulation, and executes the best feasible grasp. Each grasp attempt yields additional geometric constraints -- tactile surface contacts and the space occupied by the gripper body -- which are fused to update the object shape. Failures trigger pose re-estimation and regrasping with the refined shape. We evaluate ShapeGrasp in the real world using two different robots and grippers. To the best of our knowledge, this is the first approach that updates shape representations following a real-world grasp. ShapeGrasp achieves superior results over baselines for both grippers (grasp success rates of 84% with a three-finger gripper and 91% with a two-finger gripper), while improving 3D shape reconstruction quality on all evaluation metrics used.