Scalable Trajectory Generation for Whole-Body Mobile Manipulation

arXiv cs.RO / 4/15/2026


Key Points

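Whole-body mobile manipulation, which controls a mobile base and an arm simultaneously, has a state space that grows combinatorially with scene and object diversity. It therefore needs large-scale, physically valid trajectory data, which has so far been either labor-intensive or computationally prohibitive to obtain.
AutoMoMa is a GPU-accelerated framework that combines AKR modeling, which consolidates base, arm, and object kinematics into a single chain, with parallelized trajectory optimization, removing the bottleneck in large-scale data generation.
AutoMoMa generates 5,000 episodes per GPU-hour, over 80 times faster than CPU-based baselines, and has produced more than 500k physically valid trajectories across 330 scenes, diverse articulated objects, and multiple robot embodiments.
Imitation learning (IL) policies trained on the generated data show that even a single articulated-object task needs tens of thousands of demonstrations before SOTA methods reach about 80% success, indicating that data scarcity, rather than algorithmic limitations, has been the binding constraint.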
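
The throughput figures in the key points can be turned into rough compute budgets. The following is a back-of-the-envelope sketch in Python, not a result reported by the paper: it assumes the "over 80 times" speedup compares one GPU-hour against one hour of the CPU baseline (the summary does not state the baseline hardware), and it treats "over 500k" and "over 80" as lower bounds. All names are illustrative.

```python
# Figures quoted in the summary; the two "over ..." values are lower bounds.
EPISODES_PER_GPU_HOUR = 5_000
DATASET_SIZE = 500_000   # "over 500k" trajectories
MIN_SPEEDUP = 80         # "over 80x" faster than CPU-based baselines


def gpu_hours(n_episodes, rate=EPISODES_PER_GPU_HOUR):
    """GPU-hours needed to generate n_episodes at the quoted rate."""
    return n_episodes / rate


def cpu_baseline_rate(rate=EPISODES_PER_GPU_HOUR, speedup=MIN_SPEEDUP):
    """Upper bound on baseline episodes per hour implied by the speedup.

    Assumption: the speedup is an hour-for-hour comparison.
    """
    return rate / speedup


gpu_h = gpu_hours(DATASET_SIZE)      # at least this many GPU-hours
cpu_rate = cpu_baseline_rate()       # at most this many episodes per hour
cpu_h = DATASET_SIZE / cpu_rate      # at least this many baseline hours

print(f"GPU-hours for the dataset: {gpu_h:.0f}")
print(f"Implied baseline rate: {cpu_rate:.1f} episodes/hour")
print(f"Baseline hours for the same dataset: {cpu_h:.0f}")
```

Under these assumptions, the full dataset corresponds to roughly a hundred GPU-hours, against thousands of hours for the baseline, which is the gap the key points describe.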

Abstract

Robots deployed in unstructured environments must coordinate whole-body motion -- simultaneously moving a mobile base and arm -- to interact with the physical world. This coupled mobility and dexterity yields a state space that grows combinatorially with scene and object diversity, demanding datasets far larger than those sufficient for fixed-base manipulation. Yet existing acquisition methods, including teleoperation and planning, are either labor-intensive or computationally prohibitive at scale. The core bottleneck is the lack of a scalable pipeline for generating large-scale, physically valid, coordinated trajectory data across diverse embodiments and environments. Here we introduce AutoMoMa, a GPU-accelerated framework that unifies AKR modeling, which consolidates base, arm, and object kinematics into a single chain, with parallelized trajectory optimization. AutoMoMa achieves 5,000 episodes per GPU-hour (over $80\times$ faster than CPU-based baselines), producing a dataset of over 500k physically valid trajectories spanning 330 scenes, diverse articulated objects, and multiple robot embodiments. Prior datasets were forced to compromise on scale, diversity, or kinematic fidelity; AutoMoMa addresses all three simultaneously. Training downstream IL policies further reveals that even a single articulated-object task requires tens of thousands of demonstrations for SOTA methods to reach $\approx 80\%$ success, confirming that data scarcity -- not algorithmic limitations -- has been the binding constraint. AutoMoMa thus bridges high-performance planning and reliable IL-based control, providing the infrastructure previously missing for coordinated mobile manipulation research. By making large-scale, kinematically valid training data practical, AutoMoMa showcases generalizable whole-body robot policies capable of operating in the diverse, unstructured settings of the real world.
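
To make the "single chain" idea concrete, here is a toy planar sketch in Python. It is not AutoMoMa's implementation and the abstract does not describe AKR at this level of detail; it only illustrates one common way to fold a mobile base, an arm, and an articulated object into one serial chain. Everything here is a hypothetical assumption: the base is modeled as virtual joints `(x, y, yaw)`, the arm as two revolute joints, and the object as one further revolute joint plus a link from the grasp point to the object's fixed hinge.

```python
import math


def compose(a, b):
    """Compose planar poses (x, y, theta): apply b in the frame of a."""
    ax, ay, at = a
    bx, by, bt = b
    return (
        ax + math.cos(at) * bx - math.sin(at) * by,
        ay + math.sin(at) * bx + math.cos(at) * by,
        at + bt,
    )


def chain_fk(q, links):
    """Forward kinematics of one serial chain.

    q     = (base_x, base_y, base_yaw, joint_1, ..., joint_n)
    links = link length following each revolute joint
            (arm links first, then the grasp-to-hinge object link).
    Returns the pose (x, y, theta) at the far end of the chain.
    """
    pose = (q[0], q[1], q[2])                      # base as virtual joints
    for angle, length in zip(q[3:], links):
        pose = compose(pose, (0.0, 0.0, angle))    # rotate about the joint
        pose = compose(pose, (length, 0.0, 0.0))   # advance along the link
    return pose


def hinge_residual(q, links, hinge_xy):
    """Distance between the chain's end point and the object's fixed hinge.

    A planner over the unified chain would drive this toward zero so the
    robot's motion stays consistent with the object's articulation.
    """
    x, y, _ = chain_fk(q, links)
    return math.hypot(x - hinge_xy[0], y - hinge_xy[1])


# Hypothetical numbers: 0.5 m and 0.4 m arm links, 0.8 m from handle to hinge.
links = (0.5, 0.4, 0.8)
q = (1.0, 2.0, math.pi / 2, 0.0, 0.0, 0.0)
end_pose = chain_fk(q, links)
residual = hinge_residual(q, links, hinge_xy=(1.0, 3.7))
```

Because the base, arm, and object share one configuration vector `q`, a single optimizer can treat all of them jointly, and many such vectors can in principle be evaluated in parallel, which is the kind of batching a GPU-based trajectory optimizer would exploit.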


