BifrostUMI: Bridging Robot-Free Demonstrations and Humanoid Whole-Body Manipulation

arXiv cs.RO / 5/6/2026

Key Points

  • BifrostUMI is a new, robot-free data collection framework for training humanoid whole-body visuomotor policies, aiming to avoid the bottlenecks of robot teleoperation.
  • It uses lightweight VR to record human demonstrations as sparse keypoint trajectories while also capturing wrist-mounted visual data, producing multimodal training datasets.
  • The system trains a high-level policy network to predict future keypoint trajectories conditioned on the captured visual features, then retargets those trajectories onto the humanoid robot's body morphology (see the sketch after this list).
  • A keypoint retargeting pipeline and whole-body controller enable precise execution of agile behaviors learned from natural human demonstrations.
  • The authors report successful results in two distinct experimental scenarios, demonstrating both the framework's effectiveness and its versatility.
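To make the pipeline concrete, below is a minimal sketch of what such a high-level keypoint-prediction policy could look like. The paper does not detail its architecture here, so the module, its names, dimensions, and the plain MLP head are all illustrative assumptions rather than the authors' design.

```python
# Hypothetical sketch of a high-level keypoint-prediction policy in the
# spirit of BifrostUMI: regress a short horizon of future 3-D keypoints
# from fused wrist-camera features. All names and dimensions are assumed.
import torch
import torch.nn as nn

class KeypointPolicy(nn.Module):
    def __init__(self, feat_dim: int = 512, num_keypoints: int = 8, horizon: int = 16):
        super().__init__()
        self.num_keypoints = num_keypoints
        self.horizon = horizon
        # Simple MLP head mapping visual features to a flat keypoint horizon.
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, horizon * num_keypoints * 3),
        )

    def forward(self, visual_features: torch.Tensor) -> torch.Tensor:
        # visual_features: (batch, feat_dim), e.g. pooled wrist-camera embeddings.
        out = self.head(visual_features)
        # Reshape to (batch, horizon, num_keypoints, 3): future keypoint trajectories.
        return out.view(-1, self.horizon, self.num_keypoints, 3)

policy = KeypointPolicy()
future_traj = policy(torch.randn(4, 512))  # -> torch.Size([4, 16, 8, 3])
```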

Abstract

High-quality data collection is a cornerstone of training humanoid whole-body visuomotor policies. Current data acquisition paradigms predominantly rely on robot teleoperation, which is often hindered by limited hardware accessibility and low operational efficiency. Inspired by the Universal Manipulation Interface (UMI), we propose BifrostUMI, a portable, efficient, and robot-free data collection framework tailored for humanoid robots. BifrostUMI leverages lightweight VR devices to capture human demonstrations as sparse keypoint trajectories while simultaneously recording wrist-mounted visual data. These multimodal data are subsequently used to train a high-level policy network that predicts future keypoint trajectories conditioned on the captured visual features. Through a robust keypoint retargeting pipeline, keypoint trajectories are precisely mapped onto the robot's morphology and executed via a whole-body controller. This approach enables the seamless transfer of diverse and agile behaviors from natural human demonstrations to humanoid embodiments. We demonstrate the efficacy and versatility of the proposed framework across two distinct experimental scenarios.
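As one way to picture the retargeting step, the sketch below re-expresses a human wrist trajectory relative to the shoulder and rescales it to the robot's reach before it would be handed to the whole-body controller. The paper describes this stage only as a "robust keypoint retargeting pipeline"; the uniform-scaling rule and every name below are assumptions made for illustration, not the authors' method.

```python
# Hypothetical sketch of keypoint retargeting: map shoulder-relative human
# keypoints onto the robot's morphology via a simple limb-length scaling.
# The real pipeline is likely more involved (e.g., IK with joint limits).
import numpy as np

def retarget_keypoints(human_traj, human_shoulder, robot_shoulder,
                       human_arm_len, robot_arm_len):
    """Map a (T, 3) human wrist trajectory onto the robot's frame.

    human_traj:     (T, 3) wrist positions in the human's frame.
    human_shoulder: (3,) human shoulder position (reference origin).
    robot_shoulder: (3,) robot shoulder position in the robot's frame.
    """
    scale = robot_arm_len / human_arm_len
    # Shoulder-relative offsets, uniformly rescaled to the robot's reach.
    offsets = (human_traj - human_shoulder) * scale
    return robot_shoulder + offsets

human_traj = np.random.rand(16, 3)
robot_traj = retarget_keypoints(
    human_traj,
    human_shoulder=np.zeros(3),
    robot_shoulder=np.array([0.0, 0.2, 1.3]),
    human_arm_len=0.65,
    robot_arm_len=0.55,
)
print(robot_traj.shape)  # (16, 3)
```

The retargeted trajectory would then serve as the reference input to the whole-body controller that tracks it on the physical humanoid.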