Unified Number-Free Text-to-Motion Generation Via Flow Matching
arXiv cs.CV / March 31, 2026
Key Points
- The paper introduces Unified Motion Flow (UMF) to generate multi-person motion from text without requiring a fixed number of agents, addressing poor generalization in existing motion generators.
- UMF decomposes the task into a single-pass motion-prior generation stage (P-Flow) and multi-pass reaction generation stages (S-Flow), aiming to improve efficiency and reduce recursive error accumulation.
- P-Flow uses hierarchical resolutions conditioned on different noise levels to lower computational overhead while learning strong priors across motion data.
- S-Flow learns a joint probabilistic path for reaction transformation and context reconstruction, which the authors claim helps mitigate errors across iterative passes.
- Experiments and user studies are reported to support UMF’s effectiveness as a text-to-motion “generalist” for multi-person motion generation; the project page provides additional materials.
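Both stages build on flow matching, which generates samples by integrating a learned velocity field from noise toward data. The sketch below is a minimal, hypothetical 1-D illustration of that mechanism, not the paper's model: the velocity field is the closed-form conditional field for a linear probability path onto a single point target, where UMF would instead use text-conditioned neural networks (its P-Flow and S-Flow stages).

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = 2.0  # hypothetical 1-D "data" point the flow transports noise onto

def velocity(x, t):
    """Conditional velocity for the linear path x_t = (1 - t)*x0 + t*x1.

    With a point target x1 = TARGET this is (x1 - x_t) / (1 - t); a real
    flow-matching model predicts this field with a neural network.
    The small epsilon avoids division by zero at t = 1.
    """
    return (TARGET - x) / (1.0 - t + 1e-3)

def sample(n=1000, steps=100):
    """Euler-integrate the velocity field from noise (t=0) to data (t=1)."""
    x = rng.standard_normal(n)  # samples start as standard Gaussian noise
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity(x, k * dt)
    return x

samples = sample()
print(round(float(samples.mean()), 2))  # samples concentrate near TARGET
```

A trained model replaces the analytic `velocity` with a network, and multi-person generation would condition it on text and on the other agents' motion context, but the sampling loop itself keeps this simple Euler form.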