Bi-HIL: Bilateral Control-Based Multimodal Hierarchical Imitation Learning via Subtask-Level Progress Rate and Keyframe Memory for Long-Horizon Contact-Rich Robotic Manipulation

arXiv cs.RO / 3/27/2026

Key Points

  • The paper proposes Bi-HIL, a bilateral control-based multimodal hierarchical imitation learning framework aimed at stabilizing long-horizon, contact-rich robotic manipulation under partial observability and contact uncertainty.
  • Bi-HIL improves hierarchical coordination by adding keyframe memory and a subtask-level progress rate that explicitly models phase progression within the active subtask.
  • The progress rate conditions both the high-level and low-level policies, combining hierarchical temporal reasoning with force-aware control to better handle unstable subtask transitions.
  • Experiments on unimanual and bimanual real-robot tasks show consistent gains over flat-policy and ablated variants.
  • Overall, the results indicate that explicitly tracking subtask progression, together with force-aware bilateral control, is important for robust long-horizon manipulation.
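
To make the idea of a subtask-level progress rate concrete, the following is a minimal sketch of how per-timestep progress labels could be derived from a segmented demonstration. It assumes progress grows linearly with time inside each subtask, which the summary above does not confirm; the function name `progress_rate_labels` and the `boundaries` format are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def progress_rate_labels(boundaries: List[int]) -> List[Tuple[int, float]]:
    """Label each timestep with (subtask_index, progress in [0, 1]).

    `boundaries` holds the start index of every subtask plus the final
    episode length, e.g. [0, 4, 10] means subtask 0 covers steps 0-3
    and subtask 1 covers steps 4-9.

    Assumption: progress is linear in time within a subtask. This is an
    illustrative choice, not the paper's stated definition.
    """
    labels: List[Tuple[int, float]] = []
    for k in range(len(boundaries) - 1):
        start, end = boundaries[k], boundaries[k + 1]
        length = end - start
        if length <= 0:
            raise ValueError("subtask boundaries must be strictly increasing")
        for t in range(start, end):
            # Progress is 0.0 on the first step and 1.0 on the last step
            # of the subtask; a one-step subtask is treated as complete.
            progress = (t - start) / (length - 1) if length > 1 else 1.0
            labels.append((k, progress))
    return labels
```

Under this assumption, a policy conditioned on the label sees the value reset to 0.0 at every subtask boundary, which is one way an explicit phase signal could mark a transition. For example, `progress_rate_labels([0, 4, 10])` yields ten labels, four for the first subtask and six for the second.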

Abstract

Long-horizon contact-rich robotic manipulation remains challenging due to partial observability and unstable subtask transitions under contact uncertainty. While hierarchical architectures improve temporal reasoning and bilateral imitation learning enables force-aware control, existing approaches often rely on flat policies that struggle with long-horizon coordination. We propose Bi-HIL, a bilateral control-based multimodal hierarchical imitation learning framework for long-horizon manipulation. Bi-HIL stabilizes hierarchical coordination by integrating keyframe memory with a subtask-level progress rate that models phase progression within the active subtask and conditions both high- and low-level policies. We evaluate Bi-HIL on unimanual and bimanual real-robot tasks, demonstrating consistent improvements over flat and ablated variants. The results highlight the importance of explicitly modeling subtask progression together with force-aware control for robust long-horizon manipulation. For additional material, see https://mertcookimg.github.io/bi-hil