AI Navigate

IROSA: Interactive Robot Skill Adaptation using Natural Language

arXiv cs.CL / 3/16/2026


Key Points

  • The paper introduces IROSA, a framework for open-vocabulary skill adaptation in robotics using a tool-based architecture with a protective abstraction layer between the language model and robot hardware.
  • It relies on pre-trained LLMs to select and parameterize specific tools to adapt robot skills without fine-tuning or direct model-to-robot interaction.
  • The approach is demonstrated on a 7-DoF torque-controlled robot performing an industrial bearing ring insertion task, where natural-language commands drive speed adjustment, trajectory correction, and obstacle avoidance while preserving safety and interpretability.
  • The work targets practical industrial deployment by integrating foundation models with imitation learning, addressing challenges around safety, transparency, and deployability.
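To make the "protective abstraction layer" idea concrete, here is a minimal sketch in Python of how a tool-based dispatcher might sit between a language model and the robot side. It is an illustration under assumptions, not the paper's implementation: the tool names (`set_speed`, `add_via_point`), the JSON call format, and the parameter bounds are all hypothetical. The point it shows is that the model only ever emits a tool name plus parameters, and every call is validated against a fixed whitelist before any handler runs.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool description: a handler plus allowed ranges per parameter."""
    handler: Callable[..., str]
    bounds: Dict[str, Tuple[float, float]]


def set_speed(scale: float) -> str:
    # Placeholder for a call into the skill-adaptation layer (assumed, not from the paper).
    return f"speed scaled by {scale:.2f}"


def add_via_point(x: float, y: float, z: float) -> str:
    # Placeholder for a trajectory-correction tool (assumed, not from the paper).
    return f"via-point added at ({x:.2f}, {y:.2f}, {z:.2f})"


# Whitelist of tools the language model is allowed to select (illustrative bounds).
TOOLS: Dict[str, ToolSpec] = {
    "set_speed": ToolSpec(set_speed, {"scale": (0.1, 2.0)}),
    "add_via_point": ToolSpec(
        add_via_point, {"x": (-0.5, 0.5), "y": (-0.5, 0.5), "z": (0.0, 0.8)}
    ),
}


def dispatch(llm_output: str) -> str:
    """Validate a JSON tool call from the model; run the handler only if it passes."""
    try:
        call = json.loads(llm_output)
    except json.JSONDecodeError:
        return "rejected: output is not valid JSON"
    if not isinstance(call, dict):
        return "rejected: tool call must be a JSON object"

    name = call.get("tool")
    spec = TOOLS.get(name) if isinstance(name, str) else None
    if spec is None:
        return f"rejected: unknown tool {name!r}"

    args = call.get("args", {})
    if not isinstance(args, dict) or set(args) != set(spec.bounds):
        return f"rejected: {name} expects parameters {sorted(spec.bounds)}"

    for key, (low, high) in spec.bounds.items():
        value = args[key]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return f"rejected: {key} must be a number"
        if not low <= value <= high:
            return f"rejected: {key}={value} outside [{low}, {high}]"

    return spec.handler(**args)


if __name__ == "__main__":
    print(dispatch('{"tool": "set_speed", "args": {"scale": 0.5}}'))
    print(dispatch('{"tool": "set_speed", "args": {"scale": 5.0}}'))
    print(dispatch('{"tool": "open_gripper", "args": {}}'))
```

In this sketch a command such as "go half as fast" would be turned by the model into the first call, while out-of-range or unknown requests are rejected with a readable reason, which is one way the interpretability mentioned above could surface to an operator.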

Abstract

Foundation models have demonstrated impressive capabilities across diverse domains, while imitation learning provides principled methods for robot skill adaptation from limited data. Combining these approaches holds significant promise for direct application to robotics, yet this combination has received limited attention, particularly for industrial deployment. We present a novel framework that enables open-vocabulary skill adaptation through a tool-based architecture, maintaining a protective abstraction layer between the language model and robot hardware. Our approach leverages pre-trained LLMs to select and parameterize specific tools for adapting robot skills without requiring fine-tuning or direct model-to-robot interaction. We demonstrate the framework on a 7-DoF torque-controlled robot performing an industrial bearing ring insertion task, showing successful skill adaptation through natural language commands for speed adjustment, trajectory correction, and obstacle avoidance while maintaining safety, transparency, and interpretability.
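As an illustration of what a "speed adjustment" tool could do once the language model has chosen a parameter, the sketch below rescales the timing of a demonstrated trajectory while leaving its path unchanged. The abstract does not say how the framework represents or adapts skills, so the trajectory format (timestamped Cartesian samples) and the function name `scale_speed` are assumptions made only for this example.

```python
from typing import List, Tuple

# One sample: (time in seconds, (x, y, z) position in metres). Assumed format.
Sample = Tuple[float, Tuple[float, float, float]]


def scale_speed(trajectory: List[Sample], factor: float) -> List[Sample]:
    """Return a copy of the trajectory executed `factor` times faster.

    Positions are untouched; only the timestamps are compressed (factor > 1)
    or stretched (factor < 1) relative to the first sample.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not trajectory:
        return []
    t0 = trajectory[0][0]
    return [(t0 + (t - t0) / factor, pos) for t, pos in trajectory]


if __name__ == "__main__":
    demo = [
        (0.0, (0.40, 0.00, 0.30)),
        (1.0, (0.40, 0.05, 0.20)),
        (2.0, (0.40, 0.10, 0.10)),
    ]
    for t, pos in scale_speed(demo, 2.0):
        print(t, pos)
```

Keeping the geometric path fixed and changing only the timing means a spoken request like "slow down" cannot move the end effector anywhere the demonstration did not already go, which fits the safety emphasis of the abstract.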