PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization
arXiv cs.LG / 3/16/2026
Key Points
- The paper introduces PhysMoDPO, a Direct Preference Optimization framework that trains diffusion-based motion models by using preferences derived from physics-based and task-specific rewards.
- It integrates a Whole-Body Controller (WBC) into the training pipeline to ensure that generated motions are executable while respecting text instructions, reducing reliance on hand-crafted physics heuristics.
- The approach optimizes the diffusion model so that the WBC output is simultaneously compliant with physics and faithful to the original motion instructions, improving physical realism and task performance.
- Experiments on text-to-motion and spatial control tasks show consistent improvements in physical realism and downstream metrics, including enhanced zero-shot motion transfer and successful real-world deployment on a G1 humanoid robot.
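
The preference-optimization idea in the points above can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not give the exact reward terms or the diffusion-specific objective, so the code uses the generic DPO loss on scalar log-likelihoods, and the reward (`text_score` minus `tracking_error`), the field names, and the helper names are all hypothetical. In a diffusion setting, the log-likelihood terms would typically be replaced by a denoising-error surrogate.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Motion = Dict[str, float]  # hypothetical stand-in for a generated motion plus its scores


def toy_reward(motion: Motion) -> float:
    """Hypothetical reward: text alignment minus the controller's tracking error.

    Assumption for illustration only; the paper's actual rewards are not
    specified in this summary.
    """
    return motion["text_score"] - motion["tracking_error"]


def pick_preference_pair(
    candidates: Sequence[Motion], reward_fn: Callable[[Motion], float]
) -> Tuple[Motion, Motion]:
    """Return (preferred, rejected): the highest- and lowest-reward candidates."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a pair")
    ranked = sorted(candidates, key=reward_fn)
    return ranked[-1], ranked[0]


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    logp_w: float,
    logp_l: float,
    ref_logp_w: float,
    ref_logp_l: float,
    beta: float = 0.1,
) -> float:
    """Generic DPO loss for one (preferred, rejected) pair.

    loss = -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)])
    where *_w is the preferred sample, *_l the rejected one, and ref_* come
    from a frozen reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return softplus(-margin)  # -log(sigmoid(m)) == softplus(-m)


# Toy candidates, as if sampled from the motion model for one text prompt
# and scored after being tracked by a controller (values are made up).
candidates = [
    {"id": 0.0, "text_score": 0.8, "tracking_error": 0.30},
    {"id": 1.0, "text_score": 0.7, "tracking_error": 0.05},
    {"id": 2.0, "text_score": 0.4, "tracking_error": 0.50},
]
preferred, rejected = pick_preference_pair(candidates, toy_reward)

# Loss when the policy already favors the preferred sample relative to the reference.
example_loss = dpo_loss(logp_w=-1.0, logp_l=-3.0, ref_logp_w=-2.0, ref_logp_l=-2.0)
```

In this sketch the pair is chosen automatically from reward scores rather than from human labels, which mirrors the summary's description of preferences derived from physics-based and task-specific rewards. Minimizing the loss raises the likelihood of the preferred motion relative to the rejected one while the reference terms keep the model close to its starting point.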