Value Functions for Temporal Logic: Optimal Policies and Safety Filters
arXiv cs.RO / 5/5/2026
Key Points
- The paper studies reach, avoid, and reach-avoid problems in the undiscounted infinite-horizon setting, showing that an optimal value function does not always yield an optimal policy under greedy Q-maximization.
- For reach-avoid tasks (equivalently Until specifications), the work finds that greedy Q-maximization can lead to policies that postpone task completion indefinitely despite value optimality.
- Building on recent decompositions of temporal-logic value functions, the authors construct history-dependent (non-Markovian) policies that avoid this failure mode and prove optimality for nested Until, Globally, and Globally-Until specifications under a quantitative robustness metric.
- The paper also demonstrates that the Q-function can be used as a safety filter for more complex temporal-logic specifications, generalizing beyond simpler reach/avoid settings.
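The greedy-maximization failure mode in the second bullet can be illustrated with a toy example (a minimal sketch, not the paper's construction): on a small chain with an undiscounted reach objective, "stay" and "move toward the goal" receive identical Q-values at every non-goal state, so a greedy tie-break can select a policy that never completes the task even though the value function is optimal.

```python
# Minimal sketch (illustrative, not the paper's construction): an
# undiscounted reach problem on a 3-state chain where greedy
# Q-maximization admits a policy that never reaches the goal.
# States 0, 1, and an absorbing goal state 2.
# Actions: "stay" keeps the state, "right" moves one step toward the goal.

STATES = [0, 1, 2]
ACTIONS = ("stay", "right")
GOAL = 2

def step(s, a):
    if s == GOAL:
        return s                      # goal is absorbing
    return s if a == "stay" else s + 1

# Undiscounted reach value iteration:
# V(s) = 1 at the goal, otherwise V(s) = max_a V(step(s, a)).
V = {s: (1.0 if s == GOAL else 0.0) for s in STATES}
for _ in range(len(STATES)):
    V = {s: (1.0 if s == GOAL else max(V[step(s, a)] for a in ACTIONS))
         for s in STATES}

Q = {(s, a): (1.0 if s == GOAL else V[step(s, a)])
     for s in STATES for a in ACTIONS}

# Both actions are greedy at every non-goal state ...
assert Q[(0, "stay")] == Q[(0, "right")] == 1.0
# ... so "always stay" is Q-greedy yet postpones reaching the goal forever.
```

The value function is optimal (every state can reach the goal, so V = 1 everywhere), yet greedy action selection alone cannot distinguish progress from stalling — which is why the paper turns to history-dependent policies.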
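The safety-filter use of the Q-function in the last bullet can be sketched as follows (a hedged sketch with illustrative names, not the paper's API): a task policy proposes an action, and the filter overrides it whenever the safety Q-value of the proposed action falls below a threshold, falling back to the safest available action.

```python
# Hedged sketch of a Q-function safety filter. All names here
# (safety_filter, q_safe, threshold) are illustrative assumptions,
# not taken from the paper.

def safety_filter(state, proposed_action, q_safe, actions, threshold=0.0):
    """Return proposed_action if q_safe certifies it as safe,
    otherwise fall back to the action with the highest safety value."""
    if q_safe(state, proposed_action) >= threshold:
        return proposed_action
    return max(actions, key=lambda a: q_safe(state, a))

# Toy example: the state is the distance to an obstacle; moving closer
# becomes unsafe once the next distance would drop below 1.
ACTIONS = ["closer", "away"]

def q_safe(distance, action):
    next_distance = distance - 1 if action == "closer" else distance + 1
    return 1.0 if next_distance >= 1 else -1.0

assert safety_filter(5, "closer", q_safe, ACTIONS) == "closer"  # safe: keep
assert safety_filter(1, "closer", q_safe, ACTIONS) == "away"    # unsafe: override
```

The design point is that the learned Q-function is reused at deployment time as a runtime monitor rather than a policy, which is what lets the construction extend beyond simple reach/avoid settings.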