GraSP-STL: A Graph-Based Framework for Zero-Shot Signal Temporal Logic Planning via Offline Goal-Conditioned Reinforcement Learning
arXiv cs.RO / 4/1/2026
Key Points
- The paper introduces GraSP-STL, a graph-search-based framework for offline, zero-shot planning under Signal Temporal Logic (STL) specifications.
- It assumes only an offline dataset of state-action-state transitions from a task-agnostic behavior policy, with no dynamics model, no additional environment interaction, and no task-specific retraining.
- GraSP-STL learns a goal-conditioned value function from offline data to derive a finite-horizon reachability metric, then builds a directed state-graph abstraction whose edges represent feasible short-horizon transitions.
- Planning is a graph search over candidate waypoint sequences, which are scored using arithmetic-geometric mean robustness with interval semantics; the selected sequence is then executed by the learned goal-conditioned policy.
- The framework is designed to decouple reusable reachability learning from task-conditioned planning, enabling generalization to unseen STL tasks and longer-horizon behavior composition using short-horizon offline segments.
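
The pipeline described in the points above (reachability metric, directed state graph, graph search over waypoints, robustness scoring) can be illustrated with a toy sketch. This is not the paper's implementation. Every name here is hypothetical: `reach_steps` is a stand-in for the learned reachability metric (it uses scaled Euclidean distance instead of a goal-conditioned value function), the search is plain Dijkstra, and the score is the classical max-based robustness of a single "eventually reach a region" formula rather than the arithmetic-geometric mean robustness the paper uses.

```python
import heapq
import math

# Hypothetical stand-in for the learned reachability metric: an estimate of
# how many steps it takes to get from s to g. Assumption: scaled Euclidean
# distance replaces the goal-conditioned value function used in the paper.
def reach_steps(s, g, step_size=1.0):
    return math.dist(s, g) / step_size

def build_graph(states, horizon):
    """Add a directed edge i -> j when j is estimated reachable within `horizon` steps."""
    graph = {i: [] for i in range(len(states))}
    for i, s in enumerate(states):
        for j, g in enumerate(states):
            if i != j:
                cost = reach_steps(s, g)
                if cost <= horizon:
                    graph[i].append((j, cost))
    return graph

def shortest_waypoints(graph, start, goal):
    """Dijkstra search; returns a list of node indices, or None if unreachable."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None

def eventually_robustness(waypoints, center, radius):
    """Classical robustness of 'eventually inside a disc': max over time of signed margin."""
    return max(radius - math.dist(p, center) for p in waypoints)

# Toy dataset of states on a line; only neighbors are within the short horizon.
states = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
graph = build_graph(states, horizon=1.5)
path = shortest_waypoints(graph, start=0, goal=3)
waypoints = [states[i] for i in path]
rho = eventually_robustness(waypoints, center=(3.0, 0.0), radius=0.5)
```

In this toy setting no single edge spans the full distance, so the search has to chain short-horizon edges into a longer waypoint sequence, which is the composition idea the summary describes. A positive `rho` indicates that the waypoint sequence satisfies the toy specification.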