SixthSense: Task-Agnostic Proprioception-Only Whole-Body Wrench Estimation for Humanoids

arXiv cs.RO / 5/5/2026


Key Points

  • The paper argues humanoid robots need reliable contact perception to move beyond “toy-like” behaviors and support practical force-interaction tasks.
  • It introduces SixthSense, a task-agnostic method that estimates full-body contact timing, contact locations, and external wrenches using only proprioception and IMU data, avoiding hard-to-measure contact assumptions.
  • The approach uses conditional flow matching to model complex relationships between uncertain motion outputs and unstructured contact inputs by producing a sparse spatiotemporal contact-event flow.
  • SixthSense is positioned as a plug-and-play perception module for applications such as collision detection, physical human-robot interaction, and force-feedback teleoperation.
  • Experiments on standing, walking, and whole-body motion-tracking policies show performance the authors describe as unprecedented across these diverse behaviors.
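To make the "conditional flow matching" key point concrete, below is a minimal, self-contained sketch of the conditional flow matching objective: interpolate between noise and data along a straight path and regress a velocity field on the path's known velocity, conditioned on a context signal. The linear model, toy data, and all shapes are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: targets x1 (stand-in for "wrench" samples) depend on a
# conditioning feature c (stand-in for a proprioceptive token).
N, D = 512, 2
c = rng.normal(size=(N, 1))
x1 = np.concatenate([2.0 * c, -c], axis=1) + 0.05 * rng.normal(size=(N, D))

# Linear velocity field v(x_t, t, c) = W @ [x_t, t, c, 1] (toy stand-in
# for a neural network).
W = np.zeros((D, D + 3))

def cfm_loss_and_grad(W):
    x0 = rng.normal(size=(N, D))         # noise endpoint of the path
    t = rng.uniform(size=(N, 1))         # random time along the path
    xt = (1.0 - t) * x0 + t * x1         # linear interpolation x_t
    u = x1 - x0                          # target velocity of that path
    phi = np.concatenate([xt, t, c, np.ones_like(t)], axis=1)
    err = phi @ W.T - u                  # regression residual
    loss = np.mean(err ** 2)
    grad = 2.0 * err.T @ phi / (N * D)   # d loss / d W
    return loss, grad

loss0, _ = cfm_loss_and_grad(W)
for _ in range(500):                     # plain gradient descent
    loss, grad = cfm_loss_and_grad(W)
    W -= 0.1 * grad
print(loss0, loss)                       # training loss drops from its start
```

At inference time such a learned velocity field is integrated from noise toward a sample, which is how a flow-matching model produces the contact-event estimates the paper describes.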

Abstract

Humanoid robots are entering our physical world at scale, yet largely as oversized toys: good at singing and dancing, but short on force-interaction capabilities for practical tasks. Bridging this gap necessitates prioritizing reliable contact perception as a fundamental requirement. Estimating external wrenches in humanoids is complicated by floating-base dynamics and indeterminate contact locations. Existing analytical frameworks require idealized assumptions and hard-to-obtain measurements, which are often unavailable in practice. To bridge this gap, we propose SixthSense, a task-agnostic approach that infers whole-body contact timing, location, and wrenches from proprioception and IMU data alone. To capture the multi-modal dynamics between unstructured contact inputs and the uncertain motion outputs, we employ conditional flow matching to tokenize proprioceptive histories and estimate a spatiotemporally sparse contact-event flow. SixthSense serves as a plug-and-play perception module for applications including collision detection, physical human-robot interaction, and force-feedback teleoperation. Experiments across standing, walking, and whole-body motion-tracking policies showcase unprecedented performance in diverse behaviors.
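For context on the analytical baselines the abstract contrasts against: classical external-force estimation commonly uses a generalized-momentum residual observer, whose residual converges to the unknown external force. A minimal 1-DoF point-mass sketch (the mass, gain, controller, and external force are all illustrative assumptions, not from the paper):

```python
import numpy as np

m, dt, K = 1.0, 1e-3, 50.0   # mass, time step, observer gain (illustrative)
f_ext_true = 0.7             # unknown external force to be recovered

v = 0.0       # velocity of the point mass
integ = 0.0   # running integral of (f_ctrl + r)
r = 0.0       # momentum residual = external-force estimate

for _ in range(5000):
    f_ctrl = -2.0 * v                    # simple damping controller
    v += (f_ctrl + f_ext_true) / m * dt  # true (simulated) dynamics
    integ += (f_ctrl + r) * dt
    r = K * (m * v - integ)              # residual tracks f_ext_true

print(r)  # approaches f_ext_true as the first-order observer settles
```

The residual obeys roughly ṙ = K(f_ext − r), so it converges to the external force with time constant 1/K. The catch the abstract points at: on a floating-base humanoid this requires accurate dynamics models and contact assumptions, which SixthSense sidesteps by learning the mapping from proprioception alone.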