Understanding Behavior Cloning with Action Quantization
arXiv cs.LG / 2026/3/24
Key Points
- The paper studies behavior cloning for continuous control when actions must be discretized via action quantization, a common but theoretically under-explored technique used with autoregressive models such as Transformers and vision-language-action (VLA) models.
- It analyzes how quantization error compounds along the prediction horizon, and how this compounding interacts with the statistical sample complexity of training from expert demonstrations.
- The authors show that using behavior cloning with quantized actions and log-loss can achieve optimal sample complexity, matching known lower bounds, with only polynomial dependence on quantization error under stability and probabilistic smoothness assumptions.
- The paper compares quantization schemes by characterizing which ones satisfy or violate the required conditions, and introduces model-based augmentation that provably reduces error without relying on policy smoothness.
- It also derives fundamental limits that jointly quantify the trade-offs between quantization error and statistical complexity.
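To make the compounding effect above concrete, here is a minimal illustrative sketch (not the paper's construction): a stable scalar linear system is rolled out twice, once with the expert's continuous actions and once with the same policy passed through uniform action quantization. The `quantize` helper, the system parameters (`rho`, `k`), and the bin count are all hypothetical choices for illustration; under stability, the trajectory gap induced by per-step quantization error stays bounded rather than blowing up with the horizon.

```python
import numpy as np

def quantize(a, low=-1.0, high=1.0, bins=16):
    """Uniform action quantization: snap each action to the nearest bin center."""
    centers = low + (np.arange(bins) + 0.5) * (high - low) / bins
    idx = np.abs(a[..., None] - centers).argmin(axis=-1)
    return centers[idx]

# Stable scalar system x_{t+1} = rho * x_t + u_t with expert policy u_t = -k * x_t.
# Compare the continuous-action rollout against the quantized-action rollout.
rho, k, T = 0.9, 0.5, 50
x_cont, x_quant = 1.0, 1.0
gap = []
for _ in range(T):
    # Continuous expert action.
    x_cont = rho * x_cont + (-k * x_cont)
    # Same policy, but the action is quantized before being applied.
    uq = quantize(np.array([-k * x_quant]))[0]
    x_quant = rho * x_quant + uq
    gap.append(abs(x_cont - x_quant))

print(f"max trajectory gap over horizon: {max(gap):.4f}")
```

With a contractive closed loop (here the continuous system contracts by a factor of 0.4 per step), the per-step quantization error (at most half a bin width) accumulates but remains bounded; with an unstable system the same gap would grow with the horizon, which is why the paper's guarantees hinge on stability-type assumptions.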

