AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models

arXiv cs.LG / 3/20/2026

Key Points

AcceRL is a fully asynchronous, decoupled RL framework that separates training, inference, and rollouts to remove synchronization bottlenecks when training Vision-Language-Action (VLA) models.
The authors describe it as the first framework to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences.
On the LIBERO benchmark, the authors report state-of-the-art performance.
At the system level, they report super-linear scaling in throughput and highly efficient hardware utilization.
The world-model-augmented variant is reported to improve sample efficiency and training stability in complex control tasks.
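
To make the decoupling idea concrete, here is a minimal single-process sketch using Python's asyncio queues. It is not AcceRL's implementation, which the summary does not describe in detail. All names (`inference_server`, `rollout_worker`, `trainer`, `run_pipeline`) are hypothetical, and the "policy" is just a version counter standing in for VLA weights. The point it illustrates is that rollout workers, the inference service, and the trainer only communicate through queues, so none of them waits at a global synchronization barrier, and the trainer may consume samples produced by a slightly older policy version.

```python
import asyncio


async def inference_server(policy, request_q):
    """Answers action requests using whatever policy version is current."""
    while True:
        obs, reply = await request_q.get()
        if reply is None:  # shutdown sentinel
            return
        # Placeholder for a VLA forward pass (assumption, not the real model).
        action = (obs + policy["version"]) % 4
        reply.set_result((policy["version"], action))


async def rollout_worker(worker_id, steps, request_q, experience_q):
    """Steps a (fake) environment and never talks to the trainer directly."""
    loop = asyncio.get_running_loop()
    for t in range(steps):
        obs = worker_id * 100 + t  # placeholder observation
        reply = loop.create_future()
        await request_q.put((obs, reply))
        version, action = await reply
        await experience_q.put({"obs": obs, "action": action, "version": version})


async def trainer(policy, experience_q, batch_size, num_updates):
    """Consumes batches as they arrive and records how stale each batch is."""
    lags = []
    for _ in range(num_updates):
        batch = [await experience_q.get() for _ in range(batch_size)]
        lags.append(max(policy["version"] - tr["version"] for tr in batch))
        policy["version"] += 1  # placeholder for a gradient step
    return lags


async def run_pipeline(num_workers=4, steps=8, batch_size=4):
    policy = {"version": 0}
    request_q = asyncio.Queue()
    experience_q = asyncio.Queue(maxsize=16)  # bounded: applies backpressure
    server = asyncio.create_task(inference_server(policy, request_q))
    workers = [
        asyncio.create_task(rollout_worker(i, steps, request_q, experience_q))
        for i in range(num_workers)
    ]
    num_updates = num_workers * steps // batch_size
    lags = await trainer(policy, experience_q, batch_size, num_updates)
    await asyncio.gather(*workers)
    await request_q.put((None, None))  # stop the inference server
    await server
    return policy["version"], lags


if __name__ == "__main__":
    final_version, lags = asyncio.run(run_pipeline())
    print("policy updates:", final_version, "max policy lag per batch:", lags)
```

In a real distributed setup the three roles would run on separate processes or devices, and the policy lag recorded by the trainer is the quantity an off-policy correction would have to account for.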

Abstract

Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating training, inference, and rollouts. Crucially, AcceRL is the first to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences. Experiments on the LIBERO benchmark demonstrate that AcceRL achieves state-of-the-art (SOTA) performance. Systematically, it exhibits super-linear scaling in throughput and highly efficient hardware utilization. Algorithmically, the world-model-augmented variant delivers unprecedented sample efficiency and robust training stability in complex control tasks.
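
The abstract's notion of a trainable world model that generates virtual experiences can be illustrated with a classic Dyna-style loop. This is a deliberately simple stand-in and not the method used in the paper: the world model here is a lookup table rather than a learned network, and the environment (`env_step`), the class `TabularWorldModel`, and the function `collect` are all hypothetical names invented for this sketch. It shows only the data flow: every real transition updates the model, and the model then contributes extra imagined transitions to the training buffer.

```python
import random


class TabularWorldModel:
    """Toy 'trainable' world model: remembers the outcome of each (state, action)."""

    def __init__(self):
        self.table = {}

    def update(self, s, a, r, s_next):
        # 'Training' is just storing the latest observed outcome.
        self.table[(s, a)] = (r, s_next)

    def imagine(self, rng, n):
        # Replay n previously seen (state, action) pairs through the model.
        keys = list(self.table)
        virtual = []
        for _ in range(n):
            s, a = rng.choice(keys)
            r, s_next = self.table[(s, a)]
            virtual.append((s, a, r, s_next, "virtual"))
        return virtual


def env_step(s, a):
    """Hypothetical deterministic 1-D chain: states 0..4, reward at state 4."""
    s_next = max(0, min(4, s + a))
    r = 1.0 if s_next == 4 else 0.0
    return r, s_next


def collect(num_real=20, virtual_per_real=3, seed=0):
    rng = random.Random(seed)
    model = TabularWorldModel()
    buffer = []
    s = 0
    for _ in range(num_real):
        a = rng.choice([-1, 1])
        r, s_next = env_step(s, a)
        buffer.append((s, a, r, s_next, "real"))
        model.update(s, a, r, s_next)  # the model learns from real data
        buffer.extend(model.imagine(rng, virtual_per_real))  # virtual experience
        s = 0 if s_next == 4 else s_next  # reset after reaching the goal
    return buffer


if __name__ == "__main__":
    buf = collect()
    n_real = sum(1 for tr in buf if tr[4] == "real")
    print(len(buf), "transitions,", n_real, "real,", len(buf) - n_real, "virtual")
```

With three imagined transitions per real one, the learner sees four times as many samples per environment step, which is the basic mechanism by which a world model can raise sample efficiency. The quality of those samples depends entirely on how accurate the model is.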
