LingoMotion: An Interpretable and Unambiguous Symbolic Representation for Human Motion

arXiv cs.CV / 3/17/2026

Key Points

  • LingoMotion introduces a symbolic, interpretable motion language that aims to replace opaque latent representations for human motion.
  • It defines a motion alphabet based on joint angles and builds words, phrases, and syntax to describe both simple actions like walking and complex activities.
  • Preliminary results, which implement and evaluate the motion alphabet on the large-scale Motion-X dataset, show high fidelity in motion representation.
  • The work envisions applications in areas such as animation, robotics, and human–machine interaction by enabling more transparent and compositional motion descriptions.

Abstract

Existing representations for human motion, such as MotionGPT, often operate as black-box latent vectors with limited interpretability and build on joint positions, which can cause ambiguity. Inspired by the hierarchical structure of natural languages - from letters to words, phrases, and sentences - we propose LingoMotion, a motion language that facilitates interpretable and unambiguous symbolic representation of both simple and complex human motion. In this paper, we introduce the concept design of LingoMotion, including the definition of a motion alphabet based on joint angles, the morphology for forming words and phrases that describe simple actions like walking and their attributes like speed and scale, and the syntax for describing more complex human activities with sequences of words and phrases. The preliminary results, including the implementation and evaluation of the motion alphabet on the large-scale motion dataset Motion-X, demonstrate the high fidelity of the motion representation.
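
To make the idea of a joint-angle alphabet concrete, here is a minimal sketch of one way such an alphabet could work. It is not the paper's actual design: the abstract does not specify how letters are defined, so the uniform 15-degree bins, the lowercase letter labels, and the function names (angle_to_letter, letter_to_angle, encode_angles, max_roundtrip_error) are all assumptions made for illustration. The sketch quantizes one joint's angle sequence into letters, decodes the letters back to bin centres, and measures the round-trip error as a rough stand-in for fidelity.

```python
import string

# Assumption for illustration only: 24 uniform bins of 15 degrees, labelled a..x.
BIN_DEG = 15.0
LETTERS = string.ascii_lowercase[: int(360 / BIN_DEG)]


def angle_to_letter(angle_deg: float) -> str:
    """Map a joint angle in degrees to the letter of the bin it falls in."""
    wrapped = angle_deg % 360.0
    # Guard against float rounding that can leave `wrapped` at exactly 360.0.
    index = min(int(wrapped // BIN_DEG), len(LETTERS) - 1)
    return LETTERS[index]


def letter_to_angle(letter: str) -> float:
    """Decode a letter to the centre of its bin, in degrees."""
    return LETTERS.index(letter) * BIN_DEG + BIN_DEG / 2.0


def encode_angles(angles_deg) -> str:
    """Encode a sequence of angles for one joint as a string of letters."""
    return "".join(angle_to_letter(a) for a in angles_deg)


def max_roundtrip_error(angles_deg) -> float:
    """Largest circular difference between each angle and its decoded value."""
    worst = 0.0
    for angle, letter in zip(angles_deg, encode_angles(angles_deg)):
        diff = abs(angle - letter_to_angle(letter)) % 360.0
        worst = max(worst, min(diff, 360.0 - diff))
    return worst


if __name__ == "__main__":
    knee_angles = [0, 14.9, 15, 100, -10]  # hypothetical sample values
    print(encode_angles(knee_angles))
    print(max_roundtrip_error(knee_angles))
```

With uniform bins, the round-trip error per angle is bounded by half the bin width, so a finer alphabet trades a larger symbol set for higher fidelity. Because each letter names an angle range directly, the encoded string stays human-readable, which is the property the abstract contrasts with latent vectors.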