SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation
arXiv cs.RO · March 26, 2026
Key Points
- The paper introduces SOMA, a memory- and attribution-driven orchestration framework designed to improve Vision-Language-Action (VLA) model robustness to perceptual noise and out-of-distribution (OOD) environments without parameter fine-tuning.
- SOMA upgrades frozen VLA policies using an online pipeline that combines Dual-Memory Retrieval-Augmented Generation (RAG), an Attribution-Driven LLM orchestrator, and flexible MCP-based intervention mechanisms.
- An offline Memory Consolidation module distills execution traces into reliable priors to support better long-term decision consistency.
- Experiments on the LIBERO-PRO and new LIBERO-SOMA benchmarks across pi0, pi0.5, and SmolVLA show an average absolute success rate gain of 56.6%, including an 89.1% improvement for long-horizon task chaining.
- The authors provide a project page and open-source code to enable reproducibility and further experimentation with the proposed system.
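To make the pipeline described above concrete, here is a minimal, hypothetical sketch of a SOMA-style dual-memory retrieval and orchestration loop. The paper's actual data structures, similarity metrics, and intervention logic are not given in this summary, so every name below (`DualMemory`, `orchestrate`, the Jaccard-overlap retrieval standing in for embedding similarity) is an illustrative assumption, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    key: frozenset    # bag-of-words description of the observed situation
    advice: str       # prior distilled from a past execution trace

@dataclass
class DualMemory:
    episodic: list = field(default_factory=list)  # raw recent execution traces
    semantic: list = field(default_factory=list)  # consolidated long-term priors

    def add_trace(self, description: str, advice: str) -> None:
        # Online phase: record an execution trace in episodic memory.
        self.episodic.append(MemoryEntry(frozenset(description.split()), advice))

    def consolidate(self) -> None:
        # Offline Memory Consolidation (simplified): promote episodic traces
        # into semantic memory as reliable long-term priors.
        self.semantic.extend(self.episodic)
        self.episodic.clear()

    def retrieve(self, query: str, k: int = 1) -> list:
        # Jaccard word overlap as a toy stand-in for embedding similarity.
        q = frozenset(query.split())
        pool = self.semantic + self.episodic
        scored = sorted(pool,
                        key=lambda e: -len(q & e.key) / max(len(q | e.key), 1))
        return [e.advice for e in scored[:k]]

def orchestrate(observation: str, memory: DualMemory) -> str:
    # Orchestrator stub: if a relevant prior is retrieved, intervene with it;
    # otherwise defer to the frozen VLA policy (no parameter fine-tuning).
    hits = memory.retrieve(observation)
    return hits[0] if hits else "defer-to-frozen-policy"
```

A usage example: after consolidating a trace about a failed grasp, a later similar observation retrieves the distilled prior as the intervention.

```python
mem = DualMemory()
mem.add_trace("gripper slipped on glossy mug", "increase grasp force before lift")
mem.consolidate()
print(orchestrate("glossy mug gripper slipped again", mem))
# prints "increase grasp force before lift"
```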