Commanding Humanoid by Free-form Language: A Large Language Action Model with Unified Motion Vocabulary
arXiv cs.RO / 4/13/2026
Key Points
- The paper introduces Humanoid-LLA, a Large Language Action Model that converts free-form natural language into physically executable whole-body actions for humanoid robots.
- It proposes a unified motion vocabulary that maps human and humanoid motion primitives into a shared discrete space to improve motion diversity while preserving plausibility.
- A vocabulary-directed controller, distilled from a privileged policy, keeps the generated actions physically feasible.
- The method includes physics-informed fine-tuning via reinforcement learning with dynamics-aware rewards to improve robustness and stability.
- Experiments in simulation and on Unitree G1 and Booster T1 humanoids indicate improved language generalization, along with better motion naturalness, stability, and execution success than prior language-conditioned controllers.
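
To make the "shared discrete space" idea in the key points more concrete, here is a minimal sketch of nearest-neighbor quantization against a single codebook. It is not the paper's implementation: the two-dimensional features, the `CODEBOOK` values, the primitive labels in the comments, and the `quantize` and `tokenize` helpers are all hypothetical and chosen only for illustration. The summary does not say how Humanoid-LLA builds or learns its vocabulary.

```python
import math
from typing import List, Sequence

# Hypothetical shared codebook. Each row is a prototype for one motion
# primitive in a common feature space (values invented for this sketch).
CODEBOOK: List[List[float]] = [
    [0.0, 0.0],  # token 0, e.g. "stand"
    [1.0, 0.0],  # token 1, e.g. "step forward"
    [0.0, 1.0],  # token 2, e.g. "raise arm"
]


def quantize(feature: Sequence[float],
             codebook: Sequence[Sequence[float]] = CODEBOOK) -> int:
    """Return the index of the codebook entry closest to `feature`."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(feature, codebook[i]))


def tokenize(features: Sequence[Sequence[float]]) -> List[int]:
    """Map a sequence of per-frame features to discrete token ids."""
    return [quantize(f) for f in features]


# Assumed inputs: features already projected into the common space,
# one clip from human motion capture and one from a humanoid robot.
human_clip = [[0.1, -0.1], [0.9, 0.2], [0.2, 0.8]]
robot_clip = [[0.0, 0.1], [1.1, -0.1], [-0.1, 1.2]]

human_tokens = tokenize(human_clip)
robot_tokens = tokenize(robot_clip)
print(human_tokens, robot_tokens)
```

In this toy setup, similar human and robot motions land on the same token ids, which is the property a unified vocabulary is meant to provide: a language model can then predict tokens without caring which embodiment produced the training data.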