ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement

arXiv cs.RO / 4/23/2026


Key Points

ALAS is a cross-domain learning framework designed to improve performance on long-horizon human-scene interaction tasks that require continuous planning and extended execution across multiple environments.
The approach replaces brittle skill chaining (concatenating pre-trained subtasks) with a biologically inspired dual-stream disentanglement that separates environment understanding from self-state representation.
An environment learning module handles spatial understanding (object functions, spatial relationships, and scene semantics) and supports cross-domain transfer by disentangling the environment from the agent's self-state.
A separate skill learning module handles task execution: it processes self-state information, including joint degrees of freedom and motor patterns, and its independent motor-pattern encoding enables cross-skill transfer.
Experiments on multiple long-horizon HSI tasks show ALAS improves average subtask success rate by 23% and average execution efficiency by 29% versus existing methods.

Abstract

Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods rely heavily on skill chaining, concatenating pre-trained subtasks while keeping environment observations and self-state tightly coupled. As a result, they cannot generalize to new combinations of environments and skills and fail to complete varied LH tasks across domains. To solve this problem, this paper presents ALAS, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, ALAS comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information including joint degrees of freedom and motor patterns, enabling cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, ALAS achieves an average subtask success rate improvement of 23% and an average execution efficiency improvement of 29%.
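The abstract does not give architectural details, so the following is only a minimal sketch of what a dual-stream layout could look like, not the authors' implementation. It assumes two encoders with disjoint inputs (environment observations and self-state) whose latents are concatenated and fed to a shared action head. All names (`StreamEncoder`, `DualStreamPolicy`), the single tanh layer per stream, and the dimensions are hypothetical choices made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import List

Vector = List[float]


def linear(weights: List[Vector], x: Vector) -> Vector:
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def init_weights(rows: int, cols: int, rng: random.Random) -> List[Vector]:
    """Small random weights, scaled by the input width."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


@dataclass
class StreamEncoder:
    """Hypothetical single-layer encoder for one stream."""
    weights: List[Vector]

    def encode(self, x: Vector) -> Vector:
        return [math.tanh(v) for v in linear(self.weights, x)]


class DualStreamPolicy:
    """Illustrative policy: each encoder sees only its own stream."""

    def __init__(self, env_dim: int, self_dim: int, latent_dim: int,
                 action_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Environment stream: scene and object observations only (assumption).
        self.env_encoder = StreamEncoder(init_weights(latent_dim, env_dim, rng))
        # Skill stream: self-state only, e.g. joint values (assumption).
        self.skill_encoder = StreamEncoder(init_weights(latent_dim, self_dim, rng))
        # Shared head that maps the concatenated latents to an action.
        self.head = init_weights(action_dim, 2 * latent_dim, rng)

    def act(self, env_obs: Vector, self_state: Vector) -> Vector:
        z_env = self.env_encoder.encode(env_obs)         # never sees self_state
        z_skill = self.skill_encoder.encode(self_state)  # never sees env_obs
        return linear(self.head, z_env + z_skill)        # list concatenation
```

Because neither encoder receives the other stream's input, the environment latent is unchanged when only the self-state changes. In a layout like this, an encoder trained in one scene could in principle be paired with a skill encoder trained elsewhere, which is the kind of recombination the abstract says skill chaining lacks.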


