Simulating Infant First-Person Sensorimotor Experience via Motion Retargeting from Babies to Humanoids

arXiv cs.RO / 5/1/2026


Key Points

  • The paper proposes a framework to simulate infants’ multimodal sensorimotor experiences by retargeting motion from baby videos to humanoid robots and simulators.
  • It reconstructs the infant's full 3D body pose from a single video, extracting the skeletal structure and estimating the pose frame by frame, then maps the reconstructed motion onto multiple developmental platforms: the physical iCub robot and the virtual pyCub, EMFANT, and MIMo simulators (see the sketch after this list).
  • Replaying the retargeted motion on these embodiments generates simulated sensory streams, including proprioception (joints and muscles), touch, and vision, enabling richer analysis than approaches that match only kinematics.
  • For the best-matching embodiment, the method reports sub-centimeter retargeting accuracy, supporting both developmental-science studies and improved automated behavior annotation.
  • The authors release code publicly, positioning the framework as a tool for robotics, developmental science, and potential early detection of neurodevelopmental disorders.
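
As a concrete illustration of the retargeting step, here is a minimal sketch that maps one frame of reconstructed 3D keypoints to a humanoid joint angle. The keypoint layout, the joint name, and the `retarget_frame` helper are illustrative assumptions, not the paper's actual pipeline, which retargets the full body onto each embodiment's kinematics.

```python
import numpy as np

# Hypothetical keypoint indices for one arm; the paper's skeleton model
# and retargeting procedure are more involved.
SHOULDER, ELBOW, WRIST = 0, 1, 2

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle at keypoint b (radians) between segments b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def retarget_frame(keypoints_3d: np.ndarray) -> dict:
    """Map one frame of reconstructed 3D keypoints (N x 3, meters)
    to a dictionary of humanoid joint angles."""
    elbow_flexion = joint_angle(keypoints_3d[SHOULDER],
                                keypoints_3d[ELBOW],
                                keypoints_3d[WRIST])
    # A full retargeter would solve for all joints, respect the robot's
    # joint limits, and compensate for differing limb proportions.
    return {"l_elbow_flexion": elbow_flexion}

# Example: one synthetic frame with the elbow bent at ~90 degrees.
frame = np.array([[0.0, 0.0, 0.0],    # shoulder
                  [0.25, 0.0, 0.0],   # elbow
                  [0.25, -0.2, 0.0]]) # wrist
print(retarget_frame(frame))  # ~1.57 rad
```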

Abstract

Motion retargeting from humans to human-like artificial agents is becoming increasingly important as humanoid robots grow more capable. However, most existing approaches focus only on reproducing kinematics and ignore the rich sensorimotor experience associated with human movement. In this work, we present a framework for simulating the multimodal sensorimotor experiences of infants using physical and virtual humanoids. From a single video, our method reconstructs the infant's body configuration by extracting its skeletal structure and estimating the full 3D pose from each frame. Then we map the reconstructed motion onto several developmental platforms: the physical iCub robot and the virtual simulators pyCub, EMFANT and MIMo. Replaying the retargeted motions on these embodiments produces simulated multisensory streams including proprioception (joints and muscles), touch, and vision. For the best-matching embodiment, the retargeting achieves sub-centimeter accuracy and enables a rich multimodal analysis of infant development as well as enhanced automated annotation of behaviors. This framework provides a unique window into the infant's sensorimotor experience, offering new tools for robotics, developmental science, and early detection of neurodevelopmental disorders. The code is available at https://github.com/ctu-vras/motion-retargeting/.
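
The abstract does not spell out the metric behind the sub-centimeter claim; a plausible reading is a mean per-keypoint position error between the reconstructed infant pose and the pose replayed on the robot. The sketch below, with function names of our own choosing and synthetic data, shows how such a check might be computed.

```python
import numpy as np

def mean_position_error(infant_kpts: np.ndarray,
                        robot_kpts: np.ndarray) -> float:
    """Mean Euclidean distance (meters) between corresponding keypoints
    over all frames. Both arrays have shape (frames, keypoints, 3) and
    are assumed to live in the same scale-aligned reference frame."""
    assert infant_kpts.shape == robot_kpts.shape
    per_kpt = np.linalg.norm(infant_kpts - robot_kpts, axis=-1)
    return float(per_kpt.mean())

# Synthetic check: a trajectory perturbed by ~5 mm noise stays sub-centimeter.
rng = np.random.default_rng(0)
ref = rng.uniform(-0.3, 0.3, size=(100, 17, 3))
err = mean_position_error(ref, ref + rng.normal(0, 0.005, ref.shape))
print(f"mean error: {err * 100:.2f} cm")  # well under 1 cm
```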