Low-Burden LLM-Based Preference Learning: Personalizing Assistive Robots from Natural Language Feedback for Users with Paralysis

arXiv cs.RO / 4/3/2026


Key Points

  • The paper addresses how physically assistive robots need individualized behaviors, noting that conventional preference-learning methods, which rely on many pairwise comparisons, can overload users with profound motor impairments.
  • It proposes a low-burden offline framework that converts unstructured natural language feedback into deterministic robotic control policies using LLMs grounded in the Occupational Therapy Practice Framework (OTPF).
  • To handle ambiguity in speech-to-code translation, the pipeline performs clinical reasoning to convert subjective reactions into explicit physical and psychological requirements, which are then represented as transparent decision trees.
  • An automated “LLM-as-a-Judge” step checks the structural safety of the generated policy code before deployment.
  • In a simulated meal-preparation study with 10 adults with paralysis, the approach reduced user workload versus baselines, and clinical experts judged the resulting policies as safe and preference-accurate.
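The paper describes the learned policies as transparent, deterministic decision trees. As a minimal sketch of what such a policy representation could look like (node fields, feature names, and action labels below are illustrative assumptions, not taken from the paper):

```python
# Hypothetical sketch of a deterministic, human-readable policy tree.
# Feature names, thresholds, and action labels are illustrative only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PolicyNode:
    """Internal nodes test a named state feature; leaves hold an action."""
    feature: Optional[str] = None        # e.g. "distance_to_face_m" (assumed)
    threshold: float = 0.0
    low: Optional["PolicyNode"] = None   # branch when feature <= threshold
    high: Optional["PolicyNode"] = None  # branch when feature > threshold
    action: Optional[str] = None         # set only on leaf nodes

def decide(node: PolicyNode, state: dict) -> str:
    """Walk the tree deterministically: the same state always yields the same action."""
    while node.action is None:
        node = node.low if state[node.feature] <= node.threshold else node.high
    return node.action

# Example tree encoding a hypothetical user preference:
# "move slowly when the utensil is close to my face."
policy = PolicyNode(
    feature="distance_to_face_m", threshold=0.15,
    low=PolicyNode(action="approach_slow"),
    high=PolicyNode(action="approach_normal"),
)

print(decide(policy, {"distance_to_face_m": 0.10}))  # approach_slow
print(decide(policy, {"distance_to_face_m": 0.40}))  # approach_normal
```

Because each branch is an explicit threshold test, a clinician or user can read the tree directly and verify that it reflects the stated preference, which is the transparency property the paper emphasizes.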

Abstract

Physically Assistive Robots (PARs) require personalized behaviors to ensure user safety and comfort. However, traditional preference learning methods, such as exhaustive pairwise comparisons, cause severe physical and cognitive fatigue for users with profound motor impairments. To address this, we propose a low-burden, offline framework that translates unstructured natural language feedback directly into deterministic robotic control policies. To safely bridge the gap between ambiguous human speech and robotic code, our pipeline uses Large Language Models (LLMs) grounded in the Occupational Therapy Practice Framework (OTPF). This clinical reasoning decodes subjective user reactions into explicit physical and psychological needs, which are then mapped into transparent decision trees. Before deployment, an automated "LLM-as-a-Judge" verifies the code's structural safety. We validated this system in a simulated meal preparation study with 10 adults with paralysis. Results show our natural language approach significantly reduces user workload compared to traditional baselines. Additionally, independent clinical experts confirmed the generated policies are safe and accurately reflect user preferences.
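The paper's pre-deployment check is an LLM-as-a-Judge over the generated policy code. A complementary deterministic structural check could also be expressed programmatically; the sketch below is an assumption-laden illustration (the whitelist, the forbidden set, and the idea of using Python's `ast` module are mine, not the authors'):

```python
# Illustrative structural-safety gate for LLM-generated policy code.
# This is NOT the paper's LLM-as-a-Judge; it shows the kind of structural
# property such a judge might enforce. Whitelists are hypothetical.
import ast

ALLOWED_CALLS = {"decide"}                      # hypothetical safe callables
FORBIDDEN = {"exec", "eval", "__import__", "open"}

def structurally_safe(source: str) -> bool:
    """Reject generated code that imports modules or calls anything
    outside a small whitelist of known-safe policy functions."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        # No imports of any kind in generated policy code.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        # Every call must be to a whitelisted, non-forbidden name.
        if isinstance(node, ast.Call):
            fn = node.func
            name = fn.id if isinstance(fn, ast.Name) else getattr(fn, "attr", "")
            if name in FORBIDDEN or name not in ALLOWED_CALLS:
                return False
    return True

print(structurally_safe("decide(policy, state)"))  # True
print(structurally_safe("exec('rm -rf /')"))       # False
```

A static gate like this is cheap and deterministic, so it can run before (or alongside) any LLM-based judgment of whether the policy semantically matches the user's preferences.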