Decompose and Recompose: Reasoning New Skills from Existing Abilities for Cross-Task Robotic Manipulation

arXiv cs.RO / 5/5/2026


Key Points

  • The paper targets open-world robotic manipulation, improving cross-task generalization by extracting transferable manipulation skills from previously seen tasks.
  • It argues that existing in-context learning methods mainly provide continuous action trajectories, which leads to superficial imitation rather than composable skill transfer.
  • The proposed “Decompose and Recompose” framework represents knowledge as atomic skill-action pairs, first decomposing demonstrations into interpretable skill-action alignments and then recomposing them for unseen tasks via compositional reasoning.
  • It builds a task-adaptive dynamic demonstration library using visual-semantic retrieval plus skill sequences from a planning agent, and uses a coverage-aware static library to supply missing skill patterns (a sketch of this construction follows the list).
  • Experiments on the AGNOSTOS benchmark and in real-world setups reportedly demonstrate strong zero-shot cross-task generalization.
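
A minimal Python sketch of how such a library construction could look. The paper's code is not reproduced here; every name (`Demo`, `build_demo_library`, the embedding fields, the `static_lib` mapping) is a hypothetical stand-in, and the cosine-similarity retrieval is an assumption about what the visual-semantic matching step computes.

```python
# Hypothetical sketch of the demonstration-library construction, not the
# paper's actual implementation. All identifiers are illustrative.
from dataclasses import dataclass

import numpy as np


@dataclass
class Demo:
    task_desc: str           # language description of the seen task
    embedding: np.ndarray    # precomputed visual-semantic embedding
    skills: list[str]        # atomic skills aligned to action segments


def build_demo_library(query_emb: np.ndarray,
                       planned_skills: list[str],
                       seen_demos: list[Demo],
                       static_lib: dict[str, Demo],
                       k: int = 5) -> list[Demo]:
    """Retrieve the top-k most similar seen demos (dynamic library), then
    patch any planner-required skills they fail to cover from a static
    library that holds one canonical demo per atomic skill pattern."""
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    # Dynamic library: rank seen demos by visual-semantic similarity
    # to the unseen task's query embedding.
    ranked = sorted(seen_demos, key=lambda d: cos(query_emb, d.embedding),
                    reverse=True)
    library = ranked[:k]

    # Coverage check: which skills in the planner's sequence are still
    # missing from the retrieved demonstrations?
    covered = {s for d in library for s in d.skills}
    missing = [s for s in planned_skills if s not in covered]

    # Static library fills only the gaps, keeping the context
    # task-adaptive while remaining skill-comprehensive.
    library += [static_lib[s] for s in missing if s in static_lib]
    return library
```

The design point this sketch tries to capture: the static library is consulted only for skills the retrieved demonstrations fail to cover, so the final context stays tailored to the unseen task yet exposes every required skill pattern.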

Abstract

Cross-task generalization is a core challenge in open-world robotic manipulation, and the key lies in extracting transferable manipulation knowledge from seen tasks. Recent in-context learning approaches leverage seen task demonstrations to generate actions for unseen tasks without parameter updates. However, existing methods provide only low-level continuous action sequences as context, failing to capture composable skill knowledge and causing models to degenerate into superficial trajectory imitation. We propose Decompose and Recompose, a skill reasoning framework using atomic skill-action pairs as intermediate representations. Our approach decomposes seen demonstrations into interpretable skill-action alignments, enabling the model to recompose these skills for unseen tasks through compositional reasoning. Specifically, we construct a task-adaptive dynamic demonstration library via visual-semantic retrieval combined with skill sequences from a planning agent, complemented by a coverage-aware static library to fill missing skill patterns. Together, these yield skill-comprehensive demonstrations that explicitly elicit compositional reasoning for skill composition and execution ordering. Experiments on the AGNOSTOS benchmark and real-world environments validate our method's zero-shot cross-task generalization capability.
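
To make the "atomic skill-action pairs as intermediate representations" idea concrete, here is an illustrative sketch of how decomposed demonstrations might be serialized as in-context examples that prompt recomposition and execution ordering. The format and names (`SkillAlignedDemo`, `format_context`) are assumptions for exposition, not the paper's actual prompt or data structures.

```python
# Illustrative serialization of skill-action alignments as in-context
# examples; format and identifiers are assumptions, not the paper's code.
from dataclasses import dataclass


@dataclass
class SkillAlignedDemo:
    task_desc: str
    # Each entry pairs an atomic skill with the [start, end) index range of
    # the low-level action segment it was aligned to during decomposition.
    segments: list[tuple[str, tuple[int, int]]]


def format_context(demos: list[SkillAlignedDemo], unseen_task: str) -> str:
    """Serialize decomposed demonstrations into a context that exposes
    composable skills rather than one undifferentiated trajectory."""
    lines = []
    for d in demos:
        lines.append(f"Task: {d.task_desc}")
        for skill, (start, end) in d.segments:
            lines.append(f"  Skill: {skill} -> action steps [{start}, {end})")
    lines.append(f"Task: {unseen_task}")
    lines.append("  Recompose the skills above and order their execution.")
    return "\n".join(lines)


# Example: two seen demos decomposed into skill-action alignments.
demos = [
    SkillAlignedDemo("open the drawer", [("grasp-handle", (0, 12)),
                                         ("pull", (12, 30))]),
    SkillAlignedDemo("put the cube in the box", [("grasp", (0, 10)),
                                                 ("move-above", (10, 22)),
                                                 ("release", (22, 25))]),
]
print(format_context(demos, "put the cube in the drawer"))
```

Compared with feeding raw continuous trajectories, a context structured this way names each reusable skill explicitly, which is what lets a model compose "grasp", "pull", and "release" in a new order for the unseen task instead of imitating one trajectory wholesale.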