RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation

arXiv cs.RO / 4/13/2026


Key Points

  • RESample is proposed as an automated data augmentation framework for Vision-Language-Action (VLA) robotic manipulation, targeting limited distribution coverage in imitation learning datasets.
  • The method uses an exploratory sampling mechanism to detect coverage gaps during policy rollout and collect exploratory actions to extend training coverage efficiently, improving robustness to out-of-distribution (OOD) deployments.
  • A lightweight Coverage Function estimates coverage density of states in the training dataset, guiding exploration toward low-coverage regions.
  • Experiments on the LIBERO benchmark and real-world robotic tasks show a reported 12% performance improvement over baselines while requiring only about 10–20% additional samples.
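
The idea of a coverage score over states can be illustrated with a small sketch. The summary above does not specify how the paper's Coverage Function is built, so the code below swaps in a simple stand-in: the mean distance to the k nearest training states, mapped to a score in (0, 1]. The function name `knn_coverage`, the parameter `k`, and the synthetic state vectors are all assumptions made for illustration.

```python
import numpy as np


def knn_coverage(train_states: np.ndarray, query_states: np.ndarray, k: int = 5) -> np.ndarray:
    """Hypothetical coverage proxy: higher score means denser training coverage.

    This is a k-nearest-neighbor stand-in, not the paper's Coverage Function.
    """
    # Pairwise Euclidean distances, shape (num_queries, num_train).
    diffs = query_states[:, None, :] - train_states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Mean distance from each query to its k nearest training states.
    k = min(k, train_states.shape[0])
    mean_knn = np.sort(dists, axis=1)[:, :k].mean(axis=1)
    # Map distance to a score: 1 at distance 0, decaying toward 0 as distance grows.
    return 1.0 / (1.0 + mean_knn)


# Synthetic example: training states clustered around the origin.
rng = np.random.default_rng(0)
train_states = rng.normal(0.0, 0.1, size=(200, 4))
queries = np.array([[0.0, 0.0, 0.0, 0.0],   # inside the training cluster
                    [3.0, 3.0, 3.0, 3.0]])  # far from any training state
scores = knn_coverage(train_states, queries)
```

Under these assumptions, the query inside the cluster receives a higher score than the distant one, which is the signal an exploration procedure could use to pick out low-coverage regions. A brute-force distance matrix is adequate for a toy example; a large dataset would likely need a learned or approximate estimator.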

Abstract

Vision-Language-Action (VLA) models have demonstrated remarkable performance on complex tasks through imitation learning in recent robotic manipulation work. Trained on large-scale, high-quality demonstration datasets, existing imitation learning methods equip VLA models with strong capabilities. However, these datasets, which consist predominantly of successful trajectories, are costly to collect and often limited in distribution. This leads to capability bottlenecks during deployment: when a policy encounters out-of-distribution (OOD) scenarios, it is unable to recover. To address this issue, we propose RESample, an automated data augmentation framework that improves the distribution coverage of VLA training datasets through an exploratory sampling mechanism. Specifically, the mechanism identifies potential coverage gaps during policy rollout and actively samples exploratory actions to extend the coverage of the training data with high sample efficiency. Furthermore, to reflect the distribution of the training dataset, we propose a lightweight Coverage Function that indicates the coverage density of states in the training dataset and guides the exploratory sampling process toward low-coverage regions. To validate our method, we conduct extensive experiments on the LIBERO benchmark and a series of real-world robotic tasks, demonstrating a performance gain of 12% for RESample over baselines with only 10-20% additional samples relative to the original training data.
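
The rollout-time behavior described in the abstract (detect a coverage gap, then sample an exploratory action) can be sketched as a toy loop. This is a minimal illustration under stated assumptions, not the authors' algorithm: the distance-based `coverage` score, the `threshold`, the Gaussian action perturbation, and the 2D point environment are all hypothetical, and the abstract does not say how exploratory actions are actually chosen or labeled.

```python
import numpy as np


def coverage(train_states: np.ndarray, state: np.ndarray, k: int = 3) -> float:
    """Hypothetical coverage proxy: 1 / (1 + mean distance to the k nearest training states)."""
    dists = np.linalg.norm(train_states - state, axis=1)
    k = min(k, len(dists))
    return float(1.0 / (1.0 + np.sort(dists)[:k].mean()))


def exploratory_rollout(policy, step_fn, init_state, train_states,
                        horizon=20, threshold=0.5, noise_scale=0.1, seed=0):
    """Roll out a policy and collect (state, action) pairs from low-coverage states."""
    rng = np.random.default_rng(seed)
    state = np.asarray(init_state, dtype=float)
    collected = []
    for _ in range(horizon):
        action = policy(state)
        if coverage(train_states, state) < threshold:
            # Coverage gap: perturb the policy action (assumed exploration rule)
            # and record the sample as a candidate for the augmented dataset.
            action = action + rng.normal(0.0, noise_scale, size=action.shape)
            collected.append((state.copy(), action.copy()))
        state = step_fn(state, action)
    return collected


# Toy setup: all demonstrations sit at the origin, and the policy drifts to the right.
train_states = np.zeros((10, 2))
policy = lambda s: np.array([0.5, 0.0])
step_fn = lambda s, a: s + a
samples = exploratory_rollout(policy, step_fn, [0.0, 0.0], train_states, horizon=10)
```

In this toy setup the early steps stay close to the demonstrations and are skipped, while later steps drift into sparsely covered territory and get recorded. Collecting only in low-coverage states is one way to keep the number of added samples small relative to the original dataset.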