RoSHI: A Versatile Robot-oriented Suit for Human Data In-the-Wild

arXiv cs.RO / 4/9/2026


Key Points

  • RoSHI is a new hybrid wearable system that combines sparse low-cost IMUs with Project Aria glasses to estimate a wearer’s full 3D body pose and shape in a global metric coordinate frame using egocentric perception.
  • The approach is designed to address common data-collection tradeoffs by leveraging IMUs for robustness to occlusion and fast motions while using egocentric SLAM to anchor long-horizon movement and stabilize upper-body pose.
  • The authors collected an “agile activities” dataset to evaluate RoSHI, reporting improved performance over egocentric baselines and comparable results to a state-of-the-art exocentric baseline (SAM3D).
  • They further show that the recorded motion data can be used for real-world humanoid policy learning, linking improved mocap to downstream robot learning.
  • Videos, data, and additional materials are available on the project webpage for further research use and validation.

Abstract

Scaling up robot learning will likely require human data containing rich and long-horizon interactions in the wild. Existing approaches for collecting such data trade off portability, robustness to occlusion, and global consistency. We introduce RoSHI, a hybrid wearable that fuses low-cost sparse IMUs with the Project Aria glasses to estimate the full 3D pose and body shape of the wearer in a metric global coordinate frame from egocentric perception. This system is motivated by the complementarity of the two sensors: IMUs provide robustness to occlusions and high-speed motions, while egocentric SLAM anchors long-horizon motion and stabilizes upper body pose. We collect a dataset of agile activities to evaluate RoSHI. On this dataset, we generally outperform other egocentric baselines and perform comparably to a state-of-the-art exocentric baseline (SAM3D). Finally, we demonstrate that the motion data recorded from our system are suitable for real-world humanoid policy learning. For videos, data and more, visit the project webpage: https://roshi-mocap.github.io/
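The complementarity described above — high-rate IMU integration that drifts over time, corrected by lower-rate but drift-free SLAM anchors — can be illustrated with a minimal complementary filter on a single heading angle. This is a hedged, simplified sketch of the general fusion principle, not the RoSHI pipeline; the function name, rates, and gain are illustrative assumptions.

```python
# Illustrative sketch of IMU/SLAM complementarity (not the RoSHI method):
# integrate high-rate gyro readings for a smooth heading estimate, and
# softly pull it toward absolute SLAM headings whenever SLAM is available,
# bounding the drift that pure integration would accumulate.

def complementary_fuse(gyro_rates, slam_headings, dt=0.01, alpha=0.98):
    """Fuse per-step angular rates (rad/s) with absolute heading anchors.

    gyro_rates:    list of angular-rate samples, one per time step.
    slam_headings: list of absolute headings (rad), or None at steps where
                   SLAM is unavailable (e.g., visual occlusion).
    alpha:         weight on the integrated estimate; 1 - alpha pulls
                   toward the SLAM anchor.
    """
    heading = 0.0
    fused = []
    for rate, slam in zip(gyro_rates, slam_headings):
        heading += rate * dt            # IMU integration: smooth but drifts
        if slam is not None:            # SLAM anchor: drift-free global frame
            heading = alpha * heading + (1 - alpha) * slam
        fused.append(heading)
    return fused
```

With a constant gyro bias and a stationary ground truth, pure integration drifts linearly, while the fused estimate converges to a small bounded error near the SLAM anchor — the same qualitative behavior the paper attributes to anchoring long-horizon motion with egocentric SLAM.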