Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs
arXiv cs.AI / 4/21/2026
Key Points
- The paper proposes “Discounted Reward for Same-Length Trajectories (DReST)”, a reward function intended to make AI agents more shutdownable: it discounts reward each time the agent repeats a choice of trajectory length.
- DReST is designed to make agents neutral about trajectory length (choosing stochastically among trajectories of different lengths) while remaining useful at accomplishing their goals.
- The authors train deep RL agents with DReST and fine-tune an LLM on the same objective, then evaluate whether both behaviors generalize to unseen contexts at test time.
- Results show higher “Usefulness” than baseline: 11% higher with PPO and 18% higher with A2C. The fine-tuned LLM reaches maximum usefulness with near-maximum neutrality.
- The study offers early evidence that DReST could be a practical approach for training more advanced agents that balance usefulness with shutdownability.
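The core mechanism described above can be illustrated with a toy sketch. This is an assumption-laden illustration, not the paper's implementation: the function name `drest_reward`, the bookkeeping via a length-count dictionary, and the `discount` value are all hypothetical. The idea it demonstrates is the one the summary states: each repeated choice of the same trajectory length shrinks the reward for that length, so a reward-maximizing agent is pushed toward a stochastic mix over lengths.

```python
from collections import defaultdict

def drest_reward(base_reward, traj_length, length_counts, discount=0.9):
    """Toy DReST-style reward (illustrative only).

    Scales the task reward by discount**k, where k is how many times a
    trajectory of this length has already been chosen. Repeating one
    length keeps shrinking its reward, so mixing over lengths pays more
    in expectation than always picking the same length.
    """
    k = length_counts[traj_length]
    length_counts[traj_length] += 1  # record this choice for future discounting
    return base_reward * (discount ** k)

counts = defaultdict(int)
r1 = drest_reward(1.0, 5, counts)  # first length-5 trajectory: full reward 1.0
r2 = drest_reward(1.0, 5, counts)  # second length-5 trajectory: 0.9
r3 = drest_reward(1.0, 7, counts)  # first length-7 trajectory: full reward 1.0
```

Under this sketch, an agent that always picks length 5 earns 1.0 + 0.9 + 0.81 + … over repeated episodes, while one that alternates lengths keeps earning undiscounted rewards, which is the incentive toward length-neutrality the paper targets.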