PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization
arXiv cs.LG / March 16, 2026
Key Points
- The paper introduces PhysMoDPO, a Direct Preference Optimization (DPO) framework that trains diffusion-based motion models using preferences derived from physics-based and task-specific rewards.
- It integrates a Whole-Body Controller (WBC) into the training pipeline so that generated motions are physically executable while still following text instructions, reducing reliance on hand-crafted physics heuristics.
- The diffusion model is optimized so that the WBC's tracked output is both physically compliant and faithful to the original motion instructions, improving physical realism and task performance at the same time.
- Experiments on text-to-motion and spatial-control tasks show consistent gains in physical realism and downstream metrics, including improved zero-shot motion transfer and successful real-world deployment on a Unitree G1 humanoid robot.
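The core mechanism described above can be illustrated with a small sketch: two candidate motions are ranked by a combined physics/task reward to form a preference pair, and the pair is scored with the standard DPO objective. This is a minimal illustration, not the paper's implementation; the reward weighting, the `beta` value, and the helper names are assumptions for the example.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one preference pair.

    logp_w / logp_l: policy log-probabilities of the preferred ("winner")
    and dispreferred ("loser") motions; ref_* are the same quantities
    under the frozen reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log sigmoid(margin): small when the policy prefers the winner.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def build_preference_pair(samples, physics_reward, task_reward, w=0.5):
    """Rank candidate motions by a weighted physics + task reward
    (hypothetical weighting; the paper's exact scheme may differ)
    and return (winner, loser)."""
    scored = sorted(
        samples,
        key=lambda s: w * physics_reward(s) + (1.0 - w) * task_reward(s),
        reverse=True,
    )
    return scored[0], scored[-1]

# Toy example: two candidate motions with scalar reward proxies.
candidates = [
    {"name": "motion_a", "phys": 0.9, "task": 0.7},
    {"name": "motion_b", "phys": 0.3, "task": 0.8},
]
winner, loser = build_preference_pair(
    candidates, lambda s: s["phys"], lambda s: s["task"]
)
loss = dpo_loss(logp_w=-1.0, logp_l=-1.5, ref_logp_w=-1.2, ref_logp_l=-1.3)
```

With a zero margin the loss equals `log 2 ≈ 0.693`; as the policy assigns relatively more probability to the preferred motion, the loss decreases toward zero.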