Diffusion Forcing for Multi-Agent Interaction Sequence Modeling

arXiv cs.RO / 3/27/2026



Key Points

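This paper proposes a method for generating multi-person (multi-agent) interactions under three difficult conditions: long temporal horizons, strong inter-agent dependencies, and variable group sizes.
The proposed model, MAGNet, is a unified autoregressive diffusion framework for multi-person motion generation. Through flexible conditioning and sampling, a single model handles a range of interaction tasks (two-person and three-or-more-person settings, inpainting, prediction, agentic generation, and others).
By explicitly modeling inter-agent coupling during autoregressive denoising, MAGNet achieves coherent coordination in both tightly synchronized activities and more loosely structured social interactions.
As shown in the video, the model can generate ultra-long sequences spanning hundreds of motion steps. The authors report performance on par with specialized methods on dyadic benchmarks and a natural extension to polyadic scenarios.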

Abstract

Understanding and generating multi-person interactions is a fundamental challenge with broad implications for robotics and social computing. While humans naturally coordinate in groups, modeling such interactions remains difficult due to long temporal horizons, strong inter-agent dependencies, and variable group sizes. Existing motion generation methods are largely task-specific and do not generalize to flexible multi-agent generation. We introduce MAGNet (Multi-Agent Generative Network), a unified autoregressive diffusion framework for multi-agent motion generation that supports a wide range of interaction tasks through flexible conditioning and sampling. MAGNet performs dyadic and polyadic prediction, partner inpainting, partner prediction, and agentic generation all within a single model, and can autoregressively generate ultra-long sequences spanning hundreds of motion steps. We explicitly model inter-agent coupling during autoregressive denoising, enabling coherent coordination across agents. As a result, MAGNet captures both tightly synchronized activities (e.g., dancing, boxing) and loosely structured social interactions. Our approach performs on par with specialized methods on dyadic benchmarks while naturally extending to polyadic scenarios involving three or more interacting people. Please watch the supplemental video, where the temporal dynamics and spatial coordination of generated interactions are best appreciated. Project page: https://von31.github.io/MAGNet/
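
The abstract does not say how MAGNet's "flexible conditioning" is implemented. As a rough illustration only, diffusion-forcing-style models are usually described as giving each token its own noise level, so that observed tokens can be held clean while the rest are denoised. The sketch below builds such a per-agent, per-frame grid for two of the listed tasks. The function name `noise_levels`, the task strings, the parameters, and the 0.0/1.0 levels are all hypothetical, and the task definitions are this editor's reading of the abstract, not the authors' code.

```python
from typing import List

# Hypothetical noise levels: 0.0 = observed (kept clean), 1.0 = pure noise.
CLEAN, NOISY = 0.0, 1.0


def noise_levels(task: str, num_agents: int, num_frames: int,
                 num_past: int = 0, known_agent: int = 0) -> List[List[float]]:
    """Return an [agent][frame] grid of initial noise levels for a task.

    Assumed task meanings (not taken from the paper):
      "prediction":         the first num_past frames of every agent are given.
      "partner_inpainting": one agent's full motion is given, others generated.
    """
    # Start with every token fully noisy, i.e. unconditional generation.
    grid = [[NOISY] * num_frames for _ in range(num_agents)]
    if task == "prediction":
        for agent in range(num_agents):
            for t in range(min(num_past, num_frames)):
                grid[agent][t] = CLEAN
    elif task == "partner_inpainting":
        grid[known_agent] = [CLEAN] * num_frames
    else:
        raise ValueError(f"unknown task: {task}")
    return grid


if __name__ == "__main__":
    # Three agents, four frames, two observed past frames per agent.
    for row in noise_levels("prediction", num_agents=3, num_frames=4, num_past=2):
        print(row)
```

Under these assumptions, switching tasks only changes which cells start clean. That is one way a single model could serve several conditioning patterns, and the grid extends to any number of agents without changing shape conventions.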

