PlayGen-MoG: Framework for Diverse Multi-Agent Play Generation via Mixture-of-Gaussians Trajectory Prediction

arXiv cs.AI / 4/6/2026


Key Points

  • PlayGen-MoG is introduced as a formation-conditioned multi-agent play generation framework for team sports that aims to produce diverse, realistic coordinated trajectories.
  • The method addresses common generative failures (e.g., posterior collapse and mode averaging) by using a Mixture-of-Gaussians output head with shared mixture weights across all agents to jointly select coupled play scenarios.
  • It incorporates relative spatial attention that encodes pairwise player positions and distances as learned attention biases to improve spatial coordination.
  • Unlike forecasting approaches that require multiple observed history frames, PlayGen-MoG uses non-autoregressive prediction of absolute displacements from a single static initial formation to avoid cumulative error drift.
  • Experiments on American football tracking data report 1.68 yard ADE and 3.98 yard FDE, with all 8 mixture components in use (entropy of 2.06 out of 2.08) and qualitative evidence of diverse generation without mode collapse.
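
To make the shared-mixture-weight idea from the key points concrete, here is a minimal NumPy sketch of sampling one play. It is not the paper's implementation: the function names, the tensor shapes, and the diagonal-Gaussian parameterization (`means`, `log_stds`) are assumptions chosen for illustration. What it shows is that a single categorical draw selects one component for every agent at once, and that the sampled displacements are added directly to the initial formation rather than accumulated step by step.

```python
import numpy as np


def sample_play(logits, means, log_stds, rng):
    """Draw one coordinated play from a shared-weight Mixture-of-Gaussians.

    Shapes are illustrative assumptions, not taken from the paper:
      logits:   (K,)          one score per component, shared by all agents
      means:    (K, A, T, 2)  per-component, per-agent displacement means
      log_stds: (K, A, T, 2)  per-component, per-agent log standard deviations
    """
    # Numerically stable softmax over the K components.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # One categorical draw picks the scenario for every agent jointly,
    # so all players follow the same mixture component.
    k = rng.choice(len(weights), p=weights)

    # Diagonal Gaussian sample around the chosen component's means.
    noise = rng.standard_normal(means[k].shape)
    displacements = means[k] + np.exp(log_stds[k]) * noise
    return k, displacements


def to_positions(formation, displacements):
    """Add displacements (A, T, 2) to the initial formation (A, 2).

    Each timestep is offset from the formation itself, not from the
    previous timestep, so errors do not accumulate along the trajectory.
    """
    return formation[:, None, :] + displacements


# Hypothetical sizes: 8 components, 22 players, 10 timesteps.
rng = np.random.default_rng(0)
K, A, T = 8, 22, 10
k, disp = sample_play(
    logits=np.zeros(K),
    means=rng.standard_normal((K, A, T, 2)),
    log_stds=np.full((K, A, T, 2), -1.0),
    rng=rng,
)
positions = to_positions(rng.standard_normal((A, 2)), disp)
```

If each agent drew its own component independently, the players could end up executing incompatible scenarios; sharing the draw is what couples their trajectories in this sketch.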

Abstract

Multi-agent trajectory generation in team sports requires models that capture both the diversity of possible plays and realistic spatial coordination between players. Standard generative approaches such as Conditional Variational Autoencoders (CVAEs) and diffusion models struggle with this task, exhibiting posterior collapse or convergence to the dataset mean. Moreover, most trajectory prediction methods operate in a forecasting regime that requires multiple frames of observed history, limiting their use for play design, where only the initial formation is available. We present PlayGen-MoG, an extensible framework for formation-conditioned play generation that addresses these challenges through three design choices: (1) a Mixture-of-Gaussians (MoG) output head with shared mixture weights across all agents, where a single set of weights selects a play scenario that couples all players' trajectories; (2) relative spatial attention that encodes pairwise player positions and distances as learned attention biases; and (3) non-autoregressive prediction of absolute displacements from the initial formation, which eliminates cumulative error drift and removes the dependence on observed trajectory history, enabling realistic play generation from a single static formation. On American football tracking data, PlayGen-MoG achieves 1.68 yard ADE and 3.98 yard FDE while maintaining full utilization of all 8 mixture components (entropy of 2.06 out of 2.08), and qualitative results confirm diverse generation without mode collapse.
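
The reported metrics can be illustrated with a short NumPy sketch. It uses the common definitions of ADE (mean Euclidean error over all agents and timesteps) and FDE (mean Euclidean error at the final timestep); the abstract does not state the exact averaging protocol, so the function names and array shapes here are assumptions. The entropy helper uses natural logarithms, under which a uniform distribution over 8 components has entropy ln 8 ≈ 2.08, which matches the stated maximum.

```python
import numpy as np


def ade_fde(pred, truth):
    """Average and final displacement error for one play.

    pred, truth: arrays of shape (A, T, 2) holding positions in yards
    (shape convention assumed for this sketch).
    """
    dist = np.linalg.norm(pred - truth, axis=-1)  # (A, T) per-step error
    ade = dist.mean()          # averaged over agents and timesteps
    fde = dist[:, -1].mean()   # averaged over agents at the last timestep
    return float(ade), float(fde)


def mixture_entropy(weights):
    """Shannon entropy in nats of a vector of mixture weights."""
    w = np.asarray(weights, dtype=float)
    w = w[w > 0]  # treat 0 * log(0) as 0
    return float(-(w * np.log(w)).sum())


# Upper bound for 8 components: the uniform distribution, ln(8).
max_entropy = mixture_entropy(np.full(8, 1 / 8))
```

An entropy near this upper bound means the weights are spread almost evenly over the components, whereas a collapsed mixture that puts all of its weight on one component has entropy 0.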