Ro-SLM: Onboard Small Language Models for Robot Task Planning and Operation Code Generation

arXiv cs.RO / 4/14/2026



Key Points

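This paper proposes Ro-SLM, a framework that enables task planning and code generation with small language models (SLMs) that run onboard the robot, avoiding dependence on the cloud or on high-performance computing resources.
Ro-SLM starts from LLM-based dataset synthesis (generating diverse instructions, producing ground-truth code with minimal human effort, and augmenting instructions into scenarios close to real-world settings) and then fine-tunes the SLM on the result.
During training, an LLM serves as the reward function that guides learning, distilling its knowledge and reasoning ability to raise the performance of the onboard SLM.
Experiments on UAV operation tasks show that Ro-SLM takes an SLM that was initially unable to perform robot task planning and code generation and brings it to performance close to that of an LLM.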
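The three dataset-synthesis stages named above (instruction generation, ground-truth code generation, scenario augmentation) can be sketched roughly as follows. This is only an illustration of the data flow, not the paper's implementation: the `LLM` callable type, the `Sample` record, the prompt wording, and `stub_llm` are all hypothetical, and the human review step implied by "minimal human effort" is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


@dataclass
class Sample:
    instruction: str  # natural-language task given to the robot
    code: str         # reference ("ground truth") operation code


def synthesize_dataset(llm: LLM, seed_tasks: List[str], variants: int = 2) -> List[Sample]:
    """Build (instruction, code) pairs in three stages. Prompts are assumptions."""
    samples: List[Sample] = []
    for task in seed_tasks:
        # Stage 1: generate a task instruction from a seed topic.
        instruction = llm(f"Write a UAV task instruction about: {task}")
        # Stage 2: generate reference code for that instruction.
        code = llm(f"Write operation code for: {instruction}")
        samples.append(Sample(instruction, code))
        # Stage 3: restate the instruction as application scenarios.
        # Each variant reuses the same reference code.
        for i in range(variants):
            scenario = llm(f"Rewrite as real-world scenario variant {i}: {instruction}")
            samples.append(Sample(scenario, code))
    return samples


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in so the sketch runs without a model."""
    return f"[stub output for: {prompt}]"


dataset = synthesize_dataset(stub_llm, ["survey a field", "inspect a bridge"])
print(len(dataset))
```

Reusing one reference program across all augmented variants is a design assumption made here for simplicity; the source does not state whether augmented instructions share code with their originals.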

Abstract

Recent advances in large language models (LLMs) provide robots with contextual reasoning abilities to comprehend human instructions. Yet current LLM-enabled robots typically depend on cloud-based models or high-performance computing infrastructure, which limits their deployment on robots that operate with unreliable internet connectivity or constrained computational resources, such as UAVs and small ground vehicles. Deploying fine-tuned small language models (SLMs) that support onboard deployment therefore offers a promising alternative. This paper introduces Ro-SLM, a framework that enables reliable SLM-driven robot operation by distilling LLMs' knowledge and reasoning. Ro-SLM starts from dataset synthesis, leveraging LLMs to generate diverse task instructions, produce corresponding ground-truth code with minimal human assistance, and augment instructions into real-world application scenarios. Ro-SLM is then fine-tuned on the dataset, with an LLM serving as a reward function to guide the training. Extensive experiments on UAV operation tasks demonstrate that Ro-SLM improves the performance of an SLM from being incapable of supporting robotic task planning and code generation to achieving performance that approaches that of an LLM.
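The abstract does not say how the LLM reward enters the training loop. As one possible reading, the sketch below scores several candidate programs with a reward function and keeps the highest-scoring one, as in best-of-n selection. Everything here is an assumption for illustration: best-of-n is a swapped-in technique, not necessarily what Ro-SLM does; `RewardFn`, `stub_reward`, and `best_of_n` are hypothetical names; and the stub replaces the LLM judge with a simple token-overlap score so the example runs offline.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: (instruction, candidate_code, reference_code) -> score in [0, 1].
RewardFn = Callable[[str, str, str], float]


def stub_reward(instruction: str, candidate: str, reference: str) -> float:
    """Stand-in for an LLM judge: Jaccard overlap of whitespace-separated tokens."""
    cand, ref = set(candidate.split()), set(reference.split())
    if not cand and not ref:
        return 1.0
    return len(cand & ref) / len(cand | ref)


def best_of_n(
    instruction: str,
    candidates: List[str],
    reference: str,
    reward: RewardFn,
) -> Tuple[str, float]:
    """Return the candidate with the highest reward and its score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    scored = [(reward(instruction, c, reference), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Made-up instruction and programs, for illustration only.
instruction = "Take off, fly to waypoint (1, 2), then land."
reference = "takeoff(); fly_to(1, 2); land()"
candidates = [
    "takeoff(); land()",
    "takeoff(); fly_to(1, 2); land()",
    "noop()",
]
best, score = best_of_n(instruction, candidates, reference, stub_reward)
print(best, score)
```

In a real setup the selected candidates (or their scores) would feed back into the SLM's fine-tuning; that step depends on the training algorithm and is left out here.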