A Semantic Autonomy Framework for VLM-Integrated Indoor Mobile Robots: Hybrid Deterministic Reasoning and Cross-Robot Adaptive Memory
arXiv cs.RO · May 5, 2026
Key Points
- The paper addresses indoor mobile robots’ inability to follow natural-language intent instructions, proposing a framework that integrates vision-language reasoning with existing navigation stacks such as ROS 2 Navigation 2 (Nav2).
- It introduces a “Semantic Autonomy Stack” that combines deterministic reasoning with VLM reasoning, using a 7-step parametric resolver to handle most instructions quickly without invoking a language model, camera, or GPU.
- Only ambiguous instructions trigger the slower VLM-based reasoning path (2–9 seconds on consumer hardware), keeping deployment latency practical while retaining semantic understanding.
- To overcome the loss of memory between sessions, it adds a cross-robot adaptive semantic memory system with explicit scope categories (global environment, operator preferences, robot capabilities) that transfers learned preferences between robots.
- Experiments on two differential-drive robots running on Raspberry Pi 5 boards (no onboard GPU) report 100% semantic transfer and resolution accuracy across multiple sessions, demonstrate concurrent multi-robot feasibility, and measure a large latency reduction from deterministic resolution and shared compiled digests.
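The deterministic-first routing described above can be sketched as follows. This is a hypothetical illustration, not the paper's actual 7-step resolver: the regex patterns, the `Resolution` type, and the `vlm_call` hook are all assumptions standing in for the real parametric rules and VLM interface.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Resolution:
    target: str
    source: str  # "deterministic" or "vlm"

# Toy parametric patterns; the paper's actual resolver steps are not shown here.
_PATTERNS = [
    (re.compile(r"go to (the )?(?P<room>kitchen|lab|office)"), "room"),
    (re.compile(r"dock|charge"), None),
]

def resolve_deterministic(instruction: str) -> Optional[Resolution]:
    """Fast path: match parametric patterns with no model, camera, or GPU."""
    text = instruction.lower().strip()
    for pattern, group in _PATTERNS:
        m = pattern.search(text)
        if m:
            target = m.group(group) if group else "dock"
            return Resolution(target=target, source="deterministic")
    return None  # no parametric match: treat as ambiguous

def resolve(instruction: str, vlm_call: Callable[[str], str]) -> Resolution:
    """Try the deterministic resolver first; only ambiguous inputs hit the VLM."""
    result = resolve_deterministic(instruction)
    if result is not None:
        return result  # resolved in microseconds, no VLM latency
    return Resolution(target=vlm_call(instruction), source="vlm")
```

The key design point is the asymmetry: the cheap path runs on every instruction, and the expensive 2–9 s VLM path is reached only on a deterministic miss.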
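The scope-tagged memory transfer can be illustrated with a minimal sketch. The scope names follow the three categories listed in the summary, but the class, method names, and digest format are assumptions for illustration only.

```python
from typing import Dict

# Scopes that transfer between robots vs. stay local, per the summary's
# three categories (global environment, operator preferences, robot capabilities).
SHARED_SCOPES = {"global_environment", "operator_preferences"}
LOCAL_SCOPES = {"robot_capabilities"}

class SemanticMemory:
    def __init__(self, robot_id: str):
        self.robot_id = robot_id
        self.store: Dict[str, Dict[str, str]] = {
            s: {} for s in SHARED_SCOPES | LOCAL_SCOPES
        }

    def remember(self, scope: str, key: str, value: str) -> None:
        self.store[scope][key] = value

    def export_shared(self) -> Dict[str, Dict[str, str]]:
        """Compile a transferable digest containing shared scopes only."""
        return {s: dict(self.store[s]) for s in SHARED_SCOPES}

    def import_shared(self, digest: Dict[str, Dict[str, str]]) -> None:
        """Adopt another robot's digest without touching local capabilities."""
        for scope, facts in digest.items():
            if scope in SHARED_SCOPES:
                self.store[scope].update(facts)

a = SemanticMemory("robot_a")
a.remember("operator_preferences", "coffee_spot", "desk 3")
a.remember("robot_capabilities", "max_speed", "0.5 m/s")

b = SemanticMemory("robot_b")
b.import_shared(a.export_shared())
# robot_b now knows the operator preference, but not robot_a's speed limit
```

Filing each fact under an explicit scope is what makes the transfer safe: environment facts and operator preferences move between robots, while hardware-specific capabilities never leak across.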