Bi-HIL: Bilateral Control-Based Multimodal Hierarchical Imitation Learning via Subtask-Level Progress Rate and Keyframe Memory for Long-Horizon Contact-Rich Robotic Manipulation

arXiv cs.RO / 3/27/2026



Key Points

The paper proposes Bi-HIL, a bilateral control-based multimodal hierarchical imitation learning framework aimed at stabilizing long-horizon, contact-rich robotic manipulation under partial observability and contact uncertainty.
Bi-HIL improves hierarchical coordination by adding keyframe memory and a subtask-level progress rate that explicitly models phase progression within the active subtask.
The progress rate conditions both the high-level and low-level policies, combining hierarchical temporal reasoning with force-aware control to better handle unstable subtask transitions.
Experiments on unimanual and bimanual real-robot tasks show consistent gains over flat-policy baselines and several ablated variants.
Overall, the results indicate that explicitly tracking subtask progression, together with force-aware bilateral control, is important for robust long-horizon manipulation.

Abstract

Long-horizon contact-rich robotic manipulation remains challenging due to partial observability and unstable subtask transitions under contact uncertainty. While hierarchical architectures improve temporal reasoning and bilateral imitation learning enables force-aware control, existing approaches often rely on flat policies that struggle with long-horizon coordination. We propose Bi-HIL, a bilateral control-based multimodal hierarchical imitation learning framework for long-horizon manipulation. Bi-HIL stabilizes hierarchical coordination by integrating keyframe memory with a subtask-level progress rate that models phase progression within the active subtask and conditions both high- and low-level policies. We evaluate Bi-HIL on unimanual and bimanual real-robot tasks, demonstrating consistent improvements over flat and ablated variants. The results highlight the importance of explicitly modeling subtask progression together with force-aware control for robust long-horizon manipulation. For additional material, please check: https://mertcookimg.github.io/bi-hil
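
The abstract does not say how the subtask-level progress rate is computed. As a rough illustration only, the sketch below assumes it is the normalized elapsed time within the active subtask, rising linearly from 0 at the subtask's first step to 1 at its last. The function name `progress_rate_labels`, the `boundaries` format, and the linear form are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def progress_rate_labels(boundaries: List[int], num_steps: int) -> List[Tuple[int, float]]:
    """Label each timestep of one demonstration with (subtask_id, progress).

    ASSUMPTION (not from the paper): progress is linear in time within a
    subtask, 0.0 at the segment's first step and 1.0 at its last step.

    boundaries: sorted start indices of each subtask; the first must be 0.
    num_steps:  total number of timesteps in the demonstration.
    """
    if not boundaries or boundaries[0] != 0:
        raise ValueError("boundaries must be non-empty and start at 0")

    # Each subtask ends where the next begins; the last ends at num_steps.
    ends = boundaries[1:] + [num_steps]
    labels: List[Tuple[int, float]] = []

    for subtask_id, (start, end) in enumerate(zip(boundaries, ends)):
        length = end - start
        if length <= 0:
            raise ValueError("each subtask must contain at least one step")
        for t in range(start, end):
            # A single-step subtask is treated as already complete.
            rate = (t - start) / (length - 1) if length > 1 else 1.0
            labels.append((subtask_id, rate))

    return labels


if __name__ == "__main__":
    # Hypothetical 5-step demonstration: subtask 0 covers steps 0-2,
    # subtask 1 covers steps 3-4.
    for step, (sid, rate) in enumerate(progress_rate_labels([0, 3], 5)):
        print(step, sid, rate)
```

Under this assumption, labels of this kind could serve as a supervision target or conditioning signal for both policy levels, which is the role the abstract assigns to the progress rate. The paper's actual definition may differ.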
