Learning Reactive Human Motion Generation from Paired Interaction Data Using Transformer-Based Models
arXiv cs.CV / 4/27/2026
Key Points
- The paper studies interactive human motion generation: predicting one person's motion conditioned on another person's actions, where the two motions are mutually dependent, rather than modeling a single agent in isolation.
- It builds a new dataset of paired action–reaction motion sequences extracted from boxing match videos and evaluates Transformer-based approaches on the task.
- Three Transformer variants are compared (a simple Transformer, iTransformer, and Crossformer), with findings that the simple Transformer produces plausible interaction-aware motions without posture collapse.
- iTransformer and Crossformer are reported to accumulate errors over time, resulting in unstable motion generation.
- The authors propose adding a person ID embedding that explicitly distinguishes the two individuals, which helps the model maintain structural consistency and makes structural collapse less likely.
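
The person ID embedding idea can be illustrated with a minimal sketch: each frame's pose feature vector is tagged with a vector identifying which fighter it belongs to before the paired sequences are fed to the Transformer. All names and dimensions below are illustrative assumptions, not the authors' code, and a one-hot vector stands in for a learned embedding.

```python
# Hypothetical sketch of a person ID embedding for paired motion sequences.
# A one-hot ID vector (a stand-in for a learned embedding) is appended to
# every frame's pose features so the model can tell the two people apart.

def one_hot(person_id: int, num_people: int = 2) -> list[float]:
    """One-hot person ID vector of length num_people."""
    vec = [0.0] * num_people
    vec[person_id] = 1.0
    return vec

def add_person_id(frames: list[list[float]], person_id: int) -> list[list[float]]:
    """Append the person ID vector to each frame's pose feature vector."""
    pid = one_hot(person_id)
    return [frame + pid for frame in frames]

# Two illustrative 3-frame sequences with 4-dim pose features per frame
seq_a = [[0.1] * 4 for _ in range(3)]
seq_b = [[0.2] * 4 for _ in range(3)]

# Tag each sequence with its person ID before feeding the pair to the model
tagged_a = add_person_id(seq_a, person_id=0)
tagged_b = add_person_id(seq_b, person_id=1)
```

With this tagging, two frames with identical poses but different owners map to distinct inputs, which is the property the paper credits with reducing structural collapse.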