Planning in entropy-regularized Markov decision processes and games
arXiv cs.LG / 4/22/2026
Key Points
- The paper introduces SmoothCruiser, a new planning algorithm for estimating value functions in entropy-regularized Markov decision processes (MDPs) and two-player games using a generative environment model.
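For orientation, here is a minimal sketch of what "entropy-regularized value" means in practice. It is not SmoothCruiser itself: it is a naive recursive sampling estimator, assuming the standard log-sum-exp form of the regularized Bellman backup, V(s) = λ log Σ_a exp(Q(s, a) / λ). The names `soft_max_value`, `estimate_value`, and `sample_step`, and the parameters `lam`, `gamma`, and `n_samples`, are hypothetical and chosen only for illustration.

```python
import math


def soft_max_value(q_values, lam):
    """Entropy-regularized value: lam * log(sum_a exp(q_a / lam))."""
    m = max(q_values)  # subtract the max for numerical stability
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))


def estimate_value(state, depth, sample_step, actions,
                   lam=0.1, gamma=0.9, n_samples=4):
    """Naive recursive estimate of the regularized value at `state`.

    `sample_step(state, action) -> (reward, next_state)` is an assumed
    generative-model interface, not an API from the paper.
    """
    if depth == 0:
        return 0.0  # truncate the recursion at the planning horizon
    q_values = []
    for action in actions:
        total = 0.0
        for _ in range(n_samples):
            reward, next_state = sample_step(state, action)
            total += reward + gamma * estimate_value(
                next_state, depth - 1, sample_step, actions,
                lam, gamma, n_samples)
        q_values.append(total / n_samples)  # Monte Carlo estimate of Q(s, a)
    return soft_max_value(q_values, lam)
```

As `lam` approaches zero, `soft_max_value` approaches the ordinary maximum over actions, so the sketch reduces to a plain sampling-based planner. The cost of this naive version grows exponentially with `depth`, which is the kind of expense a dedicated planning algorithm aims to reduce.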