Towards Generalizable Robotic Data Flywheel: High-Dimensional Factorization and Composition

arXiv cs.RO / 3/27/2026


Key Points

  • The paper argues that generalist robotic models are bottlenecked by insufficient diversity and limited data efficiency, and that current approaches lack systematic strategies for collecting and curating broadly useful data.
  • It proposes F-ACIL, a factor-aware compositional iterative learning framework that factorizes the data distribution into structured factor spaces (e.g., object, action, environment) to handle implicit, sparsely distributed task diversity.
  • F-ACIL introduces factor-wise data collection together with an iterative training loop designed to improve compositional generalization across a high-dimensional factor space.
  • In extensive real-world experiments, the method reportedly delivers over 45% performance gains while requiring 5–10× fewer demonstrations compared with approaches that do not use the factorization/composition strategy.
  • The authors position structured factorization as a practical route toward building more efficient “robotic data flywheel” pipelines for generalizable robotic learning.

Abstract

The lack of sufficiently diverse data, coupled with limited data efficiency, remains a major bottleneck for generalist robotic models, yet systematic strategies for collecting and curating such data have not been fully explored. Task diversity arises from implicit factors that are sparsely distributed across multiple dimensions and are difficult to define explicitly. To address this challenge, we propose F-ACIL, a heuristic factor-aware compositional iterative learning framework that enables structured data factorization and promotes compositional generalization. F-ACIL decomposes the data distribution into structured factor spaces such as object, action, and environment. Based on this factorized formulation, we develop a factor-wise data collection scheme and an iterative training paradigm that promote compositional generalization over the high-dimensional factor space, leading to more effective utilization of real-world robotic demonstrations. In extensive real-world experiments, we show that F-ACIL achieves more than 45% performance gains with 5-10× fewer demonstrations compared with approaches that do not use the strategy. The results suggest that structured factorization offers a practical pathway toward efficient compositional generalization in real-world robotic learning. We believe F-ACIL can inspire more systematic research on building generalizable robotic data flywheel strategies. More demonstrations can be found at: https://f-acil.github.io/
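To make the idea of a factorized data distribution concrete, the following is a minimal sketch of how collecting data one factor at a time compares with covering every combination. It is not the paper's procedure: the factor values, the function names, and the one-factor-at-a-time design (vary a single factor around a fixed anchor configuration) are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical factor spaces; the values are illustrative, not from the paper.
FACTORS = {
    "object": ["cup", "bowl", "bottle", "box"],
    "action": ["pick", "push", "pour"],
    "environment": ["kitchen", "desk", "shelf"],
}


def full_grid(factors):
    """Every combination of factor values (the exhaustive baseline)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def factor_wise_plan(factors):
    """Vary one factor at a time around a fixed anchor configuration.

    Assumption: a one-factor-at-a-time design stands in for factor-wise
    collection here; the paper may select configurations differently.
    """
    anchor = {name: values[0] for name, values in factors.items()}
    plan = [anchor]
    for name, values in factors.items():
        for value in values[1:]:
            plan.append({**anchor, name: value})
    return plan


def held_out(factors, plan):
    """Combinations never collected: candidates for compositional evaluation."""
    seen = {tuple(sorted(config.items())) for config in plan}
    return [
        config
        for config in full_grid(factors)
        if tuple(sorted(config.items())) not in seen
    ]


grid = full_grid(FACTORS)
plan = factor_wise_plan(FACTORS)
unseen = held_out(FACTORS, plan)
print(len(grid), len(plan), len(unseen))  # prints: 36 8 28
```

In this toy setting the plan grows with the sum of the factor sizes, while the full grid grows with their product, and every factor value still appears in at least one collected configuration. The 28 unseen combinations are the kind of cases a compositional-generalization evaluation would probe. The sketch only illustrates the counting argument and says nothing about the 45% or 5-10× figures reported in the abstract.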