StageCraft: Execution Aware Mitigation of Distractor and Obstruction Failures in VLA Models
arXiv cs.RO / 3/24/2026
Key Points
- The paper highlights that Vision-Language-Action (VLA) models can fail during execution when distractors or physical obstructions appear in the robot's workspace, especially in unseen settings.
- It proposes StageCraft, a training-free, plug-and-play method that uses VLM in-context reasoning to modify the environment’s initial state to prevent anticipated execution failures.
- StageCraft takes policy rollout videos and their success/failure labels as input, and infers which objects in the initial state should be manipulated to avoid obstruction- or distractor-related breakdowns.
- Experiments report a 40% absolute performance improvement across three real-world domains with diverse distractors and obstructions. RLBench simulation results show that intervention strength adapts to the underlying policy and improves with more in-context samples.
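
The in-context step described above can be pictured with a minimal Python sketch. It is not the paper's implementation: the `Rollout` record, the prompt wording, the text summary standing in for a rollout video, and the `query_vlm` callable are all hypothetical placeholders, assumed only for illustration. The sketch packs labeled rollouts into a prompt, asks a VLM which objects in a new initial scene should be moved, and keeps only the answers that name objects actually present.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rollout:
    """One prior policy rollout (all fields are hypothetical)."""
    scene_objects: List[str]  # objects present in the initial state
    video_summary: str        # text stand-in for the rollout video
    success: bool             # success/failure label


def build_prompt(examples: List[Rollout], new_scene: List[str]) -> str:
    """Pack labeled rollouts and the new scene into one in-context prompt."""
    lines = ["Past rollouts of the policy:"]
    for i, r in enumerate(examples, start=1):
        label = "SUCCESS" if r.success else "FAILURE"
        lines.append(
            f"{i}. objects={', '.join(r.scene_objects)}; "
            f"observed={r.video_summary}; outcome={label}"
        )
    lines.append(f"New initial scene: {', '.join(new_scene)}")
    lines.append("List objects to move before execution, comma-separated, or NONE.")
    return "\n".join(lines)


def parse_objects(reply: str, new_scene: List[str]) -> List[str]:
    """Keep only named objects that really are in the new scene."""
    if reply.strip().upper() == "NONE":
        return []
    named = {part.strip().lower() for part in reply.split(",")}
    return [obj for obj in new_scene if obj.lower() in named]


def propose_interventions(
    examples: List[Rollout],
    new_scene: List[str],
    query_vlm: Callable[[str], str],  # assumed: prompt in, text reply out
) -> List[str]:
    """Ask the VLM which initial-state objects to manipulate."""
    reply = query_vlm(build_prompt(examples, new_scene))
    return parse_objects(reply, new_scene)


if __name__ == "__main__":
    history = [
        Rollout(["target block", "red mug"], "gripper reached for the mug", False),
        Rollout(["target block"], "block picked and placed", True),
    ]
    # Stub standing in for a real VLM call.
    stub = lambda prompt: "red mug"
    print(propose_interventions(history, ["target block", "red mug"], stub))
```

Filtering the reply against the scene's object list is a design choice of this sketch: it prevents a hallucinated object name from being passed on as an intervention target.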