MOMO: A framework for seamless physical, verbal, and graphical robot skill learning and adaptation
arXiv cs.CL / 4/23/2026
💬 Opinion · Developer Stack & Infrastructure · Models & Research
Key Points
- The MOMO framework targets industrial robots that non-expert users can flexibly adapt through three interaction modalities: kinesthetic (physical) guidance, natural language, and a graphical web UI.
- It combines energy-based human-intention detection with a “tool-based LLM” approach that selects and parameterizes predefined functions rather than generating code, making natural-language skill adaptation safer.
- For motion and learning, MOMO uses Kernelized Movement Primitives (KMPs) to encode robot skills and probabilistic Virtual Fixtures to guide demonstration recording.
- The method integrates control techniques for finishing tasks, including probabilistic guidance and ergodic control, and demonstrates voice-commanded surface finishing by generalizing adaptation from KMPs to ergodic control.
- A validation on a 7-DoF torque-controlled robot at the Automatica 2025 trade fair supports the claimed practical applicability in industrial environments.
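The “tool-based LLM” idea — let the model pick and parameterize whitelisted functions instead of writing executable code — can be sketched as a dispatcher. This is a minimal illustration, not the paper's actual API: the tool names, signatures, and call format below are hypothetical.

```python
import inspect

# Hypothetical whitelist of skill-adaptation functions; names and
# signatures are illustrative, not from the MOMO paper.
def shift_waypoint(axis: str, offset_m: float) -> str:
    """Translate the taught trajectory along one axis."""
    return f"shifted trajectory {offset_m:+.3f} m along {axis}"

def set_contact_force(newtons: float) -> str:
    """Adjust the target normal force for a finishing skill."""
    return f"contact force set to {newtons:.1f} N"

TOOLS = {f.__name__: f for f in (shift_waypoint, set_contact_force)}

def dispatch(tool_call: dict) -> str:
    """Validate an LLM-proposed call against the whitelist, then execute.

    Only predefined functions can run, and their parameters are checked
    against the real signature -- nothing the model writes is exec'd.
    """
    name = tool_call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    fn = TOOLS[name]
    bound = inspect.signature(fn).bind(**tool_call.get("arguments", {}))
    return fn(*bound.args, **bound.kwargs)

# What a parsed model response might look like:
call = {"name": "set_contact_force", "arguments": {"newtons": 12.5}}
print(dispatch(call))  # contact force set to 12.5 N
```

The safety argument is structural: an unknown tool name or a malformed parameter set raises before anything touches the robot.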
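Kernelized Movement Primitives reduce, in their simplest form, to kernel ridge regression over a reference trajectory, with adaptation done by adding via-points and re-solving. The sketch below is a heavily simplified stand-in — a full KMP weights each reference point by its covariance from GMR, which is omitted here.

```python
import numpy as np

def rbf(a, b, ell=0.1):
    """Squared-exponential kernel matrix between two time vectors."""
    d = np.asarray(a).reshape(-1, 1) - np.asarray(b).reshape(1, -1)
    return np.exp(-0.5 * (d / ell) ** 2)

def fit_kmp(t_ref, xi_ref, lam=1e-3):
    """Fit a minimal KMP-style mean predictor (kernel ridge regression).

    A full KMP regularizes each point by its GMR covariance; here every
    point gets the same scalar regularization lam.
    """
    K = rbf(t_ref, t_ref)
    alpha = np.linalg.solve(K + lam * np.eye(len(t_ref)), xi_ref)
    return lambda t: rbf(t, t_ref) @ alpha

# Reference trajectory standing in for a demonstration (1-D position).
t = np.linspace(0.0, 1.0, 20)
xi = np.sin(2 * np.pi * t)
predict = fit_kmp(t, xi)

# Adaptation: append a via-point (t=0.5, xi=0.8) and re-fit; the skill is
# pulled toward the new target without re-demonstrating the whole motion.
t_new = np.append(t, 0.5)
xi_new = np.append(xi, 0.8)
predict_adapted = fit_kmp(t_new, xi_new)
```

This via-point mechanism is what makes the natural-language adaptation loop cheap: a voice command only has to produce a few new target points, not a new demonstration.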
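Energy-based intention detection can likewise be sketched as integrating the mechanical power injected through the externally observed joint torques and flagging contact once it crosses a threshold. The thresholds, signal shapes, and detection rule below are assumptions for illustration, not the paper's detector.

```python
import numpy as np

E_THRESHOLD = 0.5   # hypothetical energy budget [J] before declaring contact
DT = 0.001          # 1 kHz control loop

def detect_intention(tau_ext, qdot, dt=DT, threshold=E_THRESHOLD):
    """Flag physical human interaction from external joint torques.

    Integrates the power P = tau_ext . qdot injected by the environment;
    only positive power (energy flowing into the robot) is counted.
    Once the accumulated energy exceeds the threshold, the controller
    could switch to kinesthetic-teaching mode.
    """
    power = np.einsum('ij,ij->i', tau_ext, qdot)        # per-sample power [W]
    energy = np.cumsum(np.clip(power, 0.0, None)) * dt  # injected energy [J]
    return energy >= threshold

# Simulated 7-DoF signals: no contact for 0.5 s, then a gentle push.
n = 1000
tau = np.zeros((n, 7)); qd = np.zeros((n, 7))
tau[500:, 0] = 5.0      # 5 Nm on joint 1
qd[500:, 0] = 0.4       # joint moves 0.4 rad/s in the same direction
flags = detect_intention(tau, qd)
```

Thresholding on energy rather than raw torque makes the detector robust to brief sensor spikes, since a spike carries little integrated power.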