Vision-Based Hand Shadowing for Robotic Manipulation via Inverse Kinematics
arXiv cs.AI / 3/13/2026
Tags: Opinion · Tools & Practical Usage · Models & Research
Key Points
- The paper presents an offline hand-shadowing and retargeting pipeline that uses a single egocentric RGB-D camera on 3D-printed glasses to control a 6-DOF robot via inverse kinematics in PyBullet.
- It detects 21 hand landmarks per hand with MediaPipe Hands, reconstructs 3D hand pose, transforms it into the robot frame, and solves a damped-least-squares IK problem to generate joint commands for the SO-ARM101.
- A gripper controller maps thumb-index geometry to grasp aperture using a four-level fallback, with actions previewed in a physics simulation before replay on the physical robot through the LeRobot framework.
- In evaluation, the structured pick-and-place benchmark achieves 90% success, while real-world unstructured environments with occlusion reduce success to 9.3%, illustrating both promise and current limitations of marker-free analytical retargeting.
- The work highlights the potential of vision-based retargeting for teleoperation while underscoring challenges like occlusion and environment clutter in achieving robust performance.
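The damped-least-squares IK solve mentioned above can be sketched in a few lines of NumPy. This is a generic illustration of the technique, not the paper's implementation: the function name, damping value, and the assumption that a task-space pose error and Jacobian are already available are all mine.

```python
import numpy as np

def dls_ik_step(jacobian, pose_error, damping=0.05):
    """One damped-least-squares IK update:
        dq = J^T (J J^T + lambda^2 I)^{-1} e
    The damping term keeps the linear solve well-conditioned near
    kinematic singularities, where plain pseudoinverse IK blows up."""
    J = np.asarray(jacobian, dtype=float)
    e = np.asarray(pose_error, dtype=float)
    JJt = J @ J.T
    reg = (damping ** 2) * np.eye(JJt.shape[0])
    # Solve the damped normal equations instead of inverting explicitly.
    return J.T @ np.linalg.solve(JJt + reg, e)
```

In a full pipeline this step would be iterated, with the Jacobian and pose error recomputed from the simulator (e.g. PyBullet's `calculateJacobian`) at each joint configuration until the end-effector error falls below a tolerance.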
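The thumb-index grasp mapping can likewise be sketched simply. The landmark indices 4 and 8 are MediaPipe Hands' thumb-tip and index-tip; everything else here (the function name, the reference span, and the single fallback of holding the last command in place of the paper's four-level scheme) is an illustrative assumption.

```python
import math

THUMB_TIP, INDEX_TIP = 4, 8  # MediaPipe Hands fingertip landmark indices

def grasp_aperture(landmarks, ref_span=0.12, last_aperture=0.5):
    """Map thumb-index fingertip distance to a normalized gripper
    aperture in [0, 1]. `landmarks` maps landmark index -> 3D point
    in metres. When either fingertip is missing, hold the previous
    aperture (a simplified stand-in for the paper's four-level
    fallback)."""
    thumb = landmarks.get(THUMB_TIP)
    index = landmarks.get(INDEX_TIP)
    if thumb is None or index is None:
        return last_aperture  # fallback: reuse the last command
    dist = math.dist(thumb, index)
    # Normalize by a nominal fully-open span and clamp to [0, 1].
    return min(max(dist / ref_span, 0.0), 1.0)
```

The clamp keeps noisy or out-of-range landmark detections from commanding the gripper beyond its physical limits.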