Adapt as You Say: Online Interactive Bimanual Skill Adaptation via Human Language Feedback

arXiv cs.RO / 3/30/2026

Key Points

  • The paper introduces BiSAIL, a framework for zero-shot online adaptation of offline-learned bimanual robot skills using human language feedback during deployment.
  • BiSAIL uses a hierarchical “reason-then-modulate” approach: it infers adaptation objectives from multimodal task variations, then applies diffusion-based motion modulation to meet those objectives.
  • Real-robot experiments across six bimanual tasks and two dual-arm platforms show that BiSAIL outperforms existing methods in human-in-the-loop adaptability, task generalization, and cross-embodiment scalability.
  • The authors claim the method supports adaptive bimanual assistants that non-experts can flexibly customize through intuitive verbal corrections, and they provide videos and code publicly.

Abstract

Developing general-purpose robots capable of autonomously operating in human living environments requires the ability to adapt to continuously evolving task conditions. However, adapting high-dimensional coordinated bimanual skills to novel task variations at deployment remains a fundamental challenge. In this work, we present BiSAIL (Bimanual Skill Adaptation via Interactive Language), a novel framework that enables zero-shot online adaptation of offline-learned bimanual skills through interactive language feedback. The key idea of BiSAIL is to adopt a hierarchical reason-then-modulate paradigm, which first infers generalized adaptation objectives from multimodal task variations, and then adapts bimanual motions via diffusion modulation to achieve the inferred objectives. Extensive real-robot experiments across six bimanual tasks and two dual-arm platforms demonstrate that BiSAIL significantly outperforms existing methods in human-in-the-loop adaptability, task generalization and cross-embodiment scalability. This work enables the development of adaptive bimanual assistants that can be flexibly customized by non-expert users via intuitive verbal corrections. Experimental videos and code are available at https://rip4kobe.github.io/BiSAIL/.
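
To make the reason-then-modulate idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the function names, the keyword lookup standing in for the reasoning stage, the 5 cm offset, and the simple pull-toward-target update standing in for diffusion modulation are all assumptions chosen for illustration. The real system presumably uses learned models at both stages.

```python
import random


def infer_objective(feedback: str) -> dict:
    """Hypothetical 'reason' stage: map a verbal correction to an objective.

    A keyword lookup stands in for the paper's reasoning module, whose
    actual design is not described in this summary.
    """
    text = feedback.lower()
    if "higher" in text:
        return {"height_offset": 0.05}  # metres, illustrative value only
    if "lower" in text:
        return {"height_offset": -0.05}
    return {"height_offset": 0.0}


def modulate(nominal, objective, steps=50, denoise=0.3, guidance=0.3, seed=0):
    """Hypothetical 'modulate' stage: toy guided denoising of two-arm heights.

    nominal: list of (left_z, right_z) waypoints from the offline skill.
    Each step pulls the sample toward the nominal skill (a stand-in for a
    learned denoiser) and toward the objective (a stand-in for guidance).
    """
    rng = random.Random(seed)
    offset = objective["height_offset"]
    # Start from pure noise, one value per arm per waypoint.
    sample = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in nominal]
    for k in range(steps):
        # Injected noise anneals to zero by the final step.
        noise_scale = 0.01 * (1.0 - (k + 1) / steps)
        for point, waypoint in zip(sample, nominal):
            for arm in range(2):
                prior_pull = denoise * (waypoint[arm] - point[arm])
                goal_pull = guidance * (waypoint[arm] + offset - point[arm])
                noise = noise_scale * rng.gauss(0.0, 1.0)
                point[arm] += prior_pull + goal_pull + noise
    return [tuple(p) for p in sample]


# Usage: adapt a three-waypoint lifting skill after a spoken correction.
nominal_skill = [(0.10, 0.10), (0.15, 0.15), (0.20, 0.20)]
adapted = modulate(nominal_skill, infer_objective("Lift the tray a bit higher"))
```

With equal `denoise` and `guidance` weights, the adapted heights settle roughly halfway between the nominal skill and the shifted target. Raising `guidance` relative to `denoise` moves the result closer to the inferred objective, at the cost of straying further from the offline-learned motion.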