CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models
arXiv cs.RO · April 27, 2026
Key Points
- CodeGraphVLP targets vision-language-action (VLA) robotics problems in long-horizon, non-Markovian environments, where the agent must remember evidence observed earlier that may later become occluded or hidden.
- The framework combines a persistent semantic-graph state (tracking task-relevant entities and relations under partial observability) with an executable, code-based hierarchical planner that generates subtasks and checks progress; minimal sketches of both follow this list.
- It uses the planner's subtask instructions and the objects they identify to create clutter-suppressed observations, improving visual grounding and reducing distraction for the VLA executor (see the clutter-suppression sketch below).
- Experiments on real-world non-Markovian tasks show higher task completion than strong VLA baselines and history-enabled variants, while also reducing planning latency compared with VLM-in-the-loop approaches.
- Extensive ablation studies isolate the contribution of each component in the hierarchical semantic-graph + code-planner + progress-guided prompting pipeline.
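
To make the graph-state idea concrete, here is a minimal Python sketch, assuming a simple per-frame detection dictionary. The class and field names (`SemanticGraph`, `Entity`, `last_seen`, `visible`) are illustrative assumptions, not the paper's actual API; the key property is that occluded entities are marked invisible but never deleted, so earlier evidence stays queryable.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str                 # e.g. "red_cup"
    last_pose: tuple          # last observed (x, y, z)
    last_seen: int            # timestep of the last observation
    visible: bool = True      # False once occluded; the record is retained


class SemanticGraph:
    """Persistent semantic-graph state: entities and pairwise relations
    tracked across timesteps under partial observability."""

    def __init__(self):
        self.entities: dict[str, Entity] = {}
        self.relations: dict[tuple[str, str], str] = {}  # (a, b) -> "inside", "on", ...

    def update(self, t: int, detections: dict[str, tuple]) -> None:
        """Refresh currently detected entities; mark the rest occluded."""
        for name, pose in detections.items():
            self.entities[name] = Entity(name, pose, t, visible=True)
        for name, entity in self.entities.items():
            if name not in detections:
                entity.visible = False  # non-Markovian memory: do NOT delete

    def add_relation(self, a: str, b: str, rel: str) -> None:
        self.relations[(a, b)] = rel

    def query(self, name: str) -> Entity | None:
        """Answers hold even when the entity is currently occluded."""
        return self.entities.get(name)
```

For example, after `add_relation("red_cup", "drawer", "inside")` and a drawer-closing step, `query("red_cup")` still returns the cup's last-known pose even though no current frame shows it.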
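The "code as planner" component can be sketched as a generated Python program over that graph state: each step pairs a natural-language subtask (handed to the VLA executor) with a progress predicate read from the graph, so completed subtasks are skipped on replanning. `execute_subtask` and the specific relation strings are hypothetical stand-ins, not the paper's interface.

```python
def execute_subtask(instruction: str, relevant: list[str]) -> None:
    """Hypothetical stub for the low-level VLA executor call."""
    print(f"executing: {instruction} (attend to {relevant})")


def run_plan(graph: "SemanticGraph") -> None:
    """What a planner-emitted program might look like for a drawer task."""
    steps = [
        ("open the drawer",
         lambda g: g.relations.get(("drawer", "cabinet")) == "open"),
        ("put the red cup inside the drawer",
         lambda g: g.relations.get(("red_cup", "drawer")) == "inside"),
        ("close the drawer",
         lambda g: g.relations.get(("drawer", "cabinet")) == "closed"),
    ]
    for instruction, done in steps:
        if done(graph):
            continue  # progress check: skip already-completed subtasks
        execute_subtask(instruction, relevant=["drawer", "red_cup"])
```

Because the plan is executable code rather than a per-step VLM call, each progress check is a cheap graph lookup, which is consistent with the reported latency advantage over VLM-in-the-loop planning.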
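Clutter suppression can be illustrated as masking: keep only the pixels inside the bounding boxes of subtask-relevant objects and blank everything else. The zero-out-outside-boxes operation here is an assumption for illustration; the paper may use a different suppression mechanism.

```python
import numpy as np


def suppress_clutter(image: np.ndarray,
                     boxes: dict[str, tuple[int, int, int, int]],
                     relevant: list[str]) -> np.ndarray:
    """Blank everything outside the boxes of subtask-relevant objects so
    the VLA executor is not distracted by unrelated scene clutter."""
    out = np.zeros_like(image)
    for name in relevant:
        if name in boxes:
            x0, y0, x1, y1 = boxes[name]  # (left, top, right, bottom) pixels
            out[y0:y1, x0:x1] = image[y0:y1, x0:x1]
    return out
```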