UMI-Underwater: Learning Underwater Manipulation without Underwater Teleoperation

arXiv cs.RO / 3/31/2026


Key Points

  • The UMI-Underwater system tackles underwater robotic grasping by combining autonomous, self-supervised collection of successful underwater grasp demonstrations with transfer of grasp knowledge from on-land demonstrations, which reduces the need for diverse underwater data.
  • It transfers grasp knowledge from handheld human demonstrations collected on land to the underwater setting through a depth-based affordance representation, which bridges the on-land-to-underwater domain gap and is robust to lighting and color shifts.
  • An affordance model trained on on-land data is deployed underwater zero-shot via geometric alignment; a diffusion policy conditioned on its affordances is then trained on the underwater demonstrations to generate control actions.
  • In pool experiments, the approach outperforms RGB-only baselines: it improves grasp performance and robustness to background changes, and it generalizes to objects seen only in on-land data.
  • The work provides code, videos, and additional results publicly via its project website.
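
The self-supervised collection step in the first key point can be pictured as a loop that attempts grasps, labels each attempt automatically, and keeps only the successes. The sketch below is illustrative only and is not the authors' pipeline: the `GraspAttempt` record, the gripper-width success heuristic, the 5 mm threshold, and the simulated rollout are all assumptions introduced here, since the summary does not say how success is detected.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class GraspAttempt:
    """Hypothetical record of one autonomous grasp attempt."""
    attempt_id: int
    final_gripper_width_m: float  # assumed reading after the gripper closes


def is_success(attempt: GraspAttempt, min_width_m: float = 0.005) -> bool:
    # Assumed heuristic: if the fingers stop before closing fully,
    # something is being held. The threshold is illustrative.
    return attempt.final_gripper_width_m > min_width_m


def simulate_attempt(attempt_id: int, rng: random.Random) -> GraspAttempt:
    # Stand-in for a real underwater rollout: about 40% of attempts
    # end with an object between the fingers.
    holds_object = rng.random() < 0.4
    width = rng.uniform(0.02, 0.06) if holds_object else 0.0
    return GraspAttempt(attempt_id, width)


def collect_successes(num_attempts: int, seed: int = 0) -> List[GraspAttempt]:
    """Run attempts and keep only those labeled successful, with no human in the loop."""
    rng = random.Random(seed)
    kept: List[GraspAttempt] = []
    for attempt_id in range(num_attempts):
        attempt = simulate_attempt(attempt_id, rng)
        if is_success(attempt):
            kept.append(attempt)
    return kept


if __name__ == "__main__":
    demos = collect_successes(100)
    print(f"kept {len(demos)} of 100 attempts as demonstrations")
```

The point of the filter is that labels come from the robot's own sensing rather than from a teleoperator, so the dataset can grow without underwater teleoperation.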

Abstract

Underwater robotic grasping is difficult due to degraded, highly variable imagery and the expense of collecting diverse underwater demonstrations. We introduce a system that (i) autonomously collects successful underwater grasp demonstrations via a self-supervised data collection pipeline and (ii) transfers grasp knowledge from on-land human demonstrations through a depth-based affordance representation that bridges the on-land-to-underwater domain gap and is robust to lighting and color shift. An affordance model trained on on-land handheld demonstrations is deployed underwater zero-shot via geometric alignment, and an affordance-conditioned diffusion policy is then trained on underwater demonstrations to generate control actions. In pool experiments, our approach improves grasping performance and robustness to background shifts, and enables generalization to objects seen only in on-land data, outperforming RGB-only baselines. Code, videos, and additional results are available at https://umi-under-water.github.io.
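
To illustrate what "affordance-conditioned diffusion policy" means at inference time, here is a minimal sketch of standard DDPM ancestral sampling in which the noise predictor receives an affordance as conditioning. It is a toy under stated assumptions, not the paper's model: the affordance is reduced to a hypothetical 3-D grasp target, the action is a 3-D vector, and `toy_noise_predictor` is an analytic stand-in for a trained network (it is the exact noise estimate when every demonstrated action equals the target). The schedule values are common defaults, not taken from the paper.

```python
import math
import random
from typing import List, Sequence, Tuple


def make_schedule(num_steps: int = 50,
                  beta_start: float = 1e-4,
                  beta_end: float = 0.02) -> Tuple[List[float], List[float]]:
    """Linear beta schedule and cumulative products of alpha = 1 - beta."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars: List[float] = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def toy_noise_predictor(noisy_action: Sequence[float],
                        alpha_bar: float,
                        affordance_target: Sequence[float]) -> List[float]:
    # Hypothetical stand-in for a learned network eps(x_t, t, affordance).
    # It inverts x_t = sqrt(alpha_bar) * target + sqrt(1 - alpha_bar) * eps.
    scale = math.sqrt(alpha_bar)
    denom = math.sqrt(1.0 - alpha_bar)
    return [(x - scale * a) / denom
            for x, a in zip(noisy_action, affordance_target)]


def sample_action(affordance_target: Sequence[float],
                  num_steps: int = 50,
                  seed: int = 0) -> List[float]:
    """DDPM reverse process: start from Gaussian noise, denoise step by step."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    action = [rng.gauss(0.0, 1.0) for _ in affordance_target]
    for step in reversed(range(num_steps)):
        beta, alpha_bar = betas[step], alpha_bars[step]
        eps = toy_noise_predictor(action, alpha_bar, affordance_target)
        coef = beta / math.sqrt(1.0 - alpha_bar)
        mean = [(x - coef * e) / math.sqrt(1.0 - beta)
                for x, e in zip(action, eps)]
        if step > 0:
            # Intermediate steps re-inject noise with variance beta_t.
            sigma = math.sqrt(beta)
            action = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            # The final step returns the posterior mean without added noise.
            action = mean
    return action


if __name__ == "__main__":
    target = [0.30, -0.10, 0.50]  # hypothetical grasp point, in meters
    print(sample_action(target))
```

With this idealized predictor the sampler lands on the conditioning target regardless of the random seed. A trained policy would instead produce a distribution over actions shaped by the underwater demonstrations and the affordance input.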