Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models

arXiv cs.RO / 4/16/2026


Key Points

  • The paper presents a fully autonomous hierarchical framework for long-horizon routing of deformable linear objects (e.g., cables and ropes), which require long-term planning and reliable multi-skill execution.
  • It converts language-specified routing goals into high-level plans using vision-language models for in-context reasoning, then relies on reinforcement learning to execute low-level manipulation skills.
  • To handle robustness over long horizons, the method includes a failure recovery mechanism that reorients the DLO into insertion-feasible states when errors occur.
  • The approach is reported to generalize across diverse scenes and command styles (including implicit language and spatial descriptions) and achieves a 92% overall success rate on long-horizon routing scenarios.
  • The work is accompanied by a project page and released as an arXiv update, positioning it as an applied research contribution to robot manipulation of deformable objects.
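The hierarchical loop described above (VLM planning on top, RL skills below, with recovery on failure) can be sketched roughly as follows. This is a minimal illustrative mock, not the paper's implementation: the planner, skill executor, recovery routine, and all names (`vlm_plan`, `run_skill`, `recover`, the clip labels) are assumptions introduced here for clarity.

```python
import random

random.seed(0)  # deterministic demo of the stochastic skill outcomes


def vlm_plan(goal: str) -> list:
    """Stand-in for in-context VLM reasoning: decompose a language
    routing goal into an ordered list of (skill, clip) steps."""
    clips = ["clip_A", "clip_B", "clip_C"]
    return [("grasp", clips[0])] + [("insert", c) for c in clips]


def run_skill(skill: str, clip: str) -> bool:
    """Stand-in for an RL-trained low-level skill. Insertions fail
    occasionally to mimic error accumulation over long horizons."""
    if skill == "insert":
        return random.random() < 0.8  # mock 80% per-attempt success
    return True


def recover(clip: str) -> None:
    """Mock of the failure-recovery mechanism: reorient the DLO
    into an insertion-feasible state before retrying."""
    pass


def route(goal: str, max_retries: int = 3) -> bool:
    """Execute the high-level plan step by step, invoking recovery
    and retrying whenever a skill fails."""
    for skill, clip in vlm_plan(goal):
        for _ in range(max_retries):
            if run_skill(skill, clip):
                break
            recover(clip)
        else:
            return False  # step exhausted its retries
    return True


print(route("route the cable through the three clips on the left"))
```

The key structural point the sketch captures is the separation of concerns: the planner is called once per goal, while execution and recovery run in an inner retry loop, so a single failed insertion does not abort the whole long-horizon task.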

Abstract

Long-horizon routing tasks of deformable linear objects (DLOs), such as cables and ropes, are common in industrial assembly lines and everyday life. These tasks are particularly challenging because they require robots to manipulate DLOs with long-horizon planning and reliable skill execution. Successfully completing such tasks demands adapting to their nonlinear dynamics, decomposing abstract routing goals, and generating multi-step plans composed of multiple skills, all of which require accurate high-level reasoning during execution. In this paper, we propose a fully autonomous hierarchical framework for solving challenging DLO routing tasks. Given an implicit or explicit routing goal expressed in language, our framework leverages vision-language models (VLMs) for in-context high-level reasoning to synthesize feasible plans, which are then executed by low-level skills trained via reinforcement learning. To improve robustness over long horizons, we further introduce a failure recovery mechanism that reorients the DLO into insertion-feasible states. Our approach generalizes to diverse scenes involving object attributes, spatial descriptions, implicit language commands, and extended 5-clip settings. It achieves an overall success rate of 92% across long-horizon routing scenarios. Please refer to our project page: https://icra2026-dloroute.github.io/DLORoute/