Vision-Based Hand Shadowing for Robotic Manipulation via Inverse Kinematics
arXiv cs.AI · March 13, 2026
💬 Opinion · Tools & Practical Usage · Models & Research
Key Points
- The paper presents an offline hand-shadowing and retargeting pipeline that uses a single egocentric RGB-D camera on 3D-printed glasses to control a 6-DOF robot via inverse kinematics in PyBullet.
- It detects 21 hand landmarks per hand with MediaPipe Hands, reconstructs 3D hand pose, transforms it into the robot frame, and solves a damped-least-squares IK problem to generate joint commands for the SO-ARM101.
- A gripper controller maps thumb-index geometry to grasp aperture using a four-level fallback, with actions previewed in a physics simulation before replay on the physical robot through the LeRobot framework.
- In evaluation, the structured pick-and-place benchmark achieves a 90% success rate, while real-world unstructured environments with occlusion and clutter reduce success to 9.3%, illustrating both the promise and the current limits of marker-free analytical retargeting.
- The work highlights the potential of vision-based retargeting for teleoperation while underscoring challenges like occlusion and environment clutter in achieving robust performance.
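The damped-least-squares IK solve mentioned above can be sketched as a single update step, \(\dot{q} = J^\top (J J^\top + \lambda^2 I)^{-1} e\). The sketch below iterates that step on a toy 2-link planar arm (not the SO-ARM101's kinematics); the damping factor, link lengths, and target are illustrative assumptions, not values from the paper.

```python
import numpy as np

def dls_ik_step(J, err, damping=0.05):
    """One damped-least-squares IK update.
    J: (m, n) Jacobian; err: (m,) task-space error.
    Returns a joint-space update dq of shape (n,)."""
    m = J.shape[0]
    # Damping keeps the inverse well-conditioned near singularities.
    A = J @ J.T + (damping ** 2) * np.eye(m)
    return J.T @ np.linalg.solve(A, err)

# Toy 2-link planar arm (hypothetical stand-in for the real robot).
def fk(q, l1=1.0, l2=1.0):
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

def jacobian(q, l1=1.0, l2=1.0):
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

q = np.array([0.3, 0.5])
target = np.array([1.2, 0.8])  # reachable: ||target|| < l1 + l2
for _ in range(200):
    q = q + dls_ik_step(jacobian(q), target - fk(q))
print(np.linalg.norm(target - fk(q)) < 1e-3)  # converged
```

In the paper's pipeline the same update would run against the 6-DOF arm's Jacobian inside PyBullet, with the retargeted hand pose supplying the task-space target at each frame.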
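The gripper mapping from thumb-index geometry can be sketched as a distance-to-aperture normalization with clamping. The calibration thresholds `closed_dist` and `open_dist` below are assumed values, and the paper's four-level fallback logic is not reproduced here; this shows only the basic geometric mapping.

```python
def thumb_index_aperture(thumb_tip, index_tip,
                         closed_dist=0.02, open_dist=0.10):
    """Map thumb-to-index fingertip distance (metres) to a gripper
    aperture command in [0, 1]: 0 = fully closed, 1 = fully open.
    closed_dist / open_dist are assumed calibration constants."""
    # Euclidean distance between the two 3D fingertip landmarks.
    d = sum((a - b) ** 2 for a, b in zip(thumb_tip, index_tip)) ** 0.5
    t = (d - closed_dist) / (open_dist - closed_dist)
    return min(1.0, max(0.0, t))  # clamp outside the calibrated range

# Fingertips 6 cm apart land halfway through the calibrated range.
print(thumb_index_aperture((0, 0, 0), (0.06, 0, 0)))  # 0.5
```

In the full system this command would be previewed in simulation and then replayed on the physical gripper through the LeRobot framework, with the fallback levels covering frames where one or both fingertip landmarks are unreliable.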