The Model Knows, the Decoder Finds: Future Value Guided Particle Power Sampling
arXiv cs.AI / 5/5/2026
Key Points
- The paper addresses a key challenge in training-free reasoning: base LLMs already place some probability mass on correct multi-step solutions, but inference-time decoding must find those probability modes efficiently.
- It proposes Auxiliary Particle Power Sampling (APPS), which approximates a sequence-level power target proportional to p_theta(x)^alpha (with alpha > 1) using a bounded number of particles in a blockwise, parallel way.
- APPS uses proposal-corrected power reweighting and future-value-guided selection at resampling boundaries to allocate compute across competing prefixes instead of committing to one decoding path.
- The method includes practical future-value estimates via short-horizon rollouts and an amortized variant that uses a lightweight learned selection head to reduce overhead.
- Experiments on reasoning benchmarks show that APPS improves the accuracy–runtime trade-off for training-free decoding, and that better inference-time approximation of the power target recovers more of the gap to post-trained systems.
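The blockwise loop the bullets describe can be sketched as a small particle filter. This is a minimal, self-contained toy, not the paper's implementation: the two-token "model", the rollout-based value estimate, and all function names and hyperparameters (`block_len`, `alpha`, rollout horizon) are illustrative assumptions. The key lines are the incremental weight update, `(alpha - 1) * logp`, which corrects the base-model proposal toward the power target p_theta(x)^alpha, and the resampling score, which adds a short-horizon future-value estimate so compute flows to promising prefixes rather than one committed path.

```python
import math
import random

random.seed(0)

def next_token_probs(prefix):
    # Toy stand-in for the base model p_theta: token 1 is mildly
    # self-reinforcing, so sequences of 1s form a probability mode.
    bias = 0.6 if (not prefix or prefix[-1] == 1) else 0.4
    return {1: bias, 0: 1.0 - bias}

def sample_block(prefix, block_len):
    """Sample a block from the base model (proposal = p_theta); return (tokens, log p)."""
    toks, logp = [], 0.0
    for _ in range(block_len):
        probs = next_token_probs(prefix + toks)
        t = random.choices(list(probs), weights=list(probs.values()))[0]
        logp += math.log(probs[t])
        toks.append(t)
    return toks, logp

def future_value(prefix, horizon=4, n_rollouts=2):
    """Cheap future-value estimate: mean log-prob of short rollouts from the prefix."""
    vals = [sample_block(prefix, horizon)[1] for _ in range(n_rollouts)]
    return sum(vals) / len(vals)

def apps_decode(n_particles=8, n_blocks=3, block_len=4, alpha=2.0):
    # Each particle is (tokens_so_far, log importance weight).
    particles = [([], 0.0) for _ in range(n_particles)]
    for _ in range(n_blocks):
        extended = []
        for toks, logw in particles:
            block, logp = sample_block(toks, block_len)
            # Proposal-corrected power reweighting: target ∝ p^alpha,
            # proposal = p, so the incremental weight is p^(alpha - 1).
            extended.append((toks + block, logw + (alpha - 1.0) * logp))
        # Future-value-guided selection at the block boundary:
        # resample in proportion to weight × estimated continuation value.
        scores = [lw + future_value(t) for t, lw in extended]
        m = max(scores)  # subtract max before exp for numerical stability
        ws = [math.exp(s - m) for s in scores]
        idx = random.choices(range(len(extended)), weights=ws, k=n_particles)
        particles = [(extended[i][0], 0.0) for i in idx]  # reset weights after resampling
    return particles
```

The amortized variant in the paper would replace `future_value`'s rollouts with a learned selection head scoring each prefix in one forward pass, which removes the rollout overhead at the boundaries.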