You've Got a Golden Ticket: Improving Generative Robot Policies With A Single Noise Vector

arXiv cs.RO / 4/13/2026


Key Points

  • The paper shows that pretrained generative robot policies using diffusion or flow matching can be improved by replacing stochastic initial noise sampling from a Gaussian with a single, well-chosen constant noise vector (“golden ticket”).
  • It introduces a Monte-Carlo search method that keeps the pretrained policy frozen, trains no new networks, and relies only on injecting initial noise and evaluating sparse task rewards from rollouts.
  • Across 38 of 43 robot manipulation tasks (simulated and real-world), golden tickets yield relative success-rate gains of up to 58% in simulation and up to 60% within 50 search episodes in real-world settings.
  • The authors find golden tickets also help in multi-task scenarios: the behavioral diversity across different tickets naturally defines a Pareto frontier for balancing objectives (e.g., speed vs. success rate), and in VLA settings a ticket optimized for one task can also boost related tasks.
  • A codebase is released with pretrained policies and golden tickets for simulation benchmarks spanning VLAs, diffusion policies, and flow matching policies.

Abstract

What happens when a pretrained generative robot policy is provided a constant initial noise as input, rather than repeatedly sampling it from a Gaussian? We demonstrate that the performance of a pretrained, frozen diffusion or flow matching policy can be improved with respect to a downstream reward by swapping the sampling of initial noise from the prior distribution (typically isotropic Gaussian) with a well-chosen, constant initial noise input -- a golden ticket. We propose a search method to find golden tickets using Monte-Carlo policy evaluation that keeps the pretrained policy frozen, does not train any new networks, and is applicable to all diffusion/flow matching policies (and therefore many VLAs). Our approach to policy improvement makes no assumptions beyond being able to inject initial noise into the policy and calculate (sparse) task rewards of episode rollouts, making it deployable with no additional infrastructure or models. Our method improves the performance of policies in 38 out of 43 tasks across simulated and real-world robot manipulation benchmarks, with relative improvements in success rate by up to 58% for some simulated tasks, and 60% within 50 search episodes for real-world tasks. We also show unique benefits of golden tickets for multi-task settings: the diversity of behaviors from different tickets naturally defines a Pareto frontier for balancing different objectives (e.g., speed, success rates); in VLAs, we find that a golden ticket optimized for one task can also boost performance in other related tasks. We release a codebase with pretrained policies and golden tickets for simulation benchmarks using VLAs, diffusion policies, and flow matching policies.
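The search procedure described in the abstract — draw candidate initial noise vectors, evaluate each by averaging sparse task rewards over a few rollouts with the policy frozen, and keep the best — can be sketched as follows. This is a minimal illustration, not the authors' released code; the `policy` and `env_reward` interfaces are hypothetical stand-ins for a frozen diffusion/flow matching policy and an episode-rollout reward function.

```python
import numpy as np

def find_golden_ticket(policy, env_reward, noise_dim,
                       num_candidates=20, rollouts_per_candidate=5, seed=0):
    """Monte-Carlo search for a 'golden ticket' initial noise vector.

    policy       -- frozen generative policy mapping an initial noise
                    vector to an action/trajectory (hypothetical interface)
    env_reward   -- runs one episode rollout with the given action and
                    returns a (sparse) task reward (hypothetical interface)
    noise_dim    -- dimensionality of the policy's initial noise input
    """
    rng = np.random.default_rng(seed)
    best_noise, best_score = None, -np.inf
    for _ in range(num_candidates):
        # Draw a candidate from the policy's prior (isotropic Gaussian),
        # then hold it constant across all evaluation rollouts.
        noise = rng.standard_normal(noise_dim)
        # Monte-Carlo policy evaluation: average sparse rewards over
        # several rollouts; no networks are trained or updated.
        score = np.mean([env_reward(policy(noise))
                         for _ in range(rollouts_per_candidate)])
        if score > best_score:
            best_noise, best_score = noise, score
    return best_noise, best_score
```

At deployment, the returned `best_noise` simply replaces the usual per-episode Gaussian sample as the policy's initial noise, which is why the method needs no extra infrastructure or models.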