TIGFlow-GRPO: Trajectory Forecasting via Interaction-Aware Flow Matching and Reward-Driven Optimization
arXiv cs.AI / 3/27/2026
Key Points
- The paper introduces TIGFlow-GRPO, a two-stage framework for human trajectory forecasting that explicitly aligns generated trajectories with behavioral rules and scene constraints rather than relying mainly on supervised fitting.
- In the first stage, it builds a Conditional Flow Matching (CFM) predictor enhanced with a Trajectory-Interaction-Graph (TIG) module to better encode agent–agent and agent–scene interactions from spatio-temporal observations.
- In the second stage, it applies Flow-GRPO post-training, converting the deterministic flow ODE rollout into stochastic SDE sampling to encourage exploration of multimodal futures.
- Training uses a composite reward combining view-aware social compliance and map-aware physical feasibility, with GRPO progressively steering predictions toward behaviorally plausible outcomes.
- Experiments on ETH/UCY and SDD demonstrate improved forecasting accuracy, more stable long-horizon behavior, and trajectories that are both socially compliant and physically feasible.
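The first-stage objective described above is standard conditional flow matching: sample a time along a straight interpolation path between noise and the target trajectory, and regress the constant velocity field. A minimal sketch (the function name `cfm_loss`, the linear path, and the dummy zero model are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_loss(model, x0, x1, cond):
    """One conditional-flow-matching step: sample t, interpolate along the
    straight path from noise x0 to target x1, regress velocity x1 - x0."""
    batch = x0.shape[0]
    t = rng.uniform(size=(batch, 1))       # per-sample time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x1          # linear interpolation path
    v_target = x1 - x0                     # ground-truth velocity field
    v_pred = model(x_t, t, cond)           # model's predicted velocity
    return float(np.mean((v_pred - v_target) ** 2))

# Toy stand-in "model" that always predicts zero velocity (illustration only);
# the paper conditions a real network on TIG interaction features instead.
zero_model = lambda x_t, t, cond: np.zeros_like(x_t)

x0 = rng.normal(size=(4, 2))   # noise samples
x1 = rng.normal(size=(4, 2))   # future trajectory targets (flattened)
loss = cfm_loss(zero_model, x0, x1, cond=None)
print(loss > 0.0)
```

In the paper, `cond` would carry the TIG module's encoding of agent–agent and agent–scene interactions; here it is unused by the toy model.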

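The second-stage ODE-to-SDE conversion can be illustrated with a plain Euler–Maruyama rollout: adding a diffusion term to the deterministic flow integration makes repeated rollouts from the same condition land on different futures, which is what gives GRPO something to rank. The constant noise scale `sigma` and the toy linear velocity field are assumptions for illustration; Flow-GRPO's actual SDE is constructed to preserve the flow's marginals.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_sde(velocity, x0, n_steps=50, sigma=0.1):
    """Euler–Maruyama rollout: the deterministic ODE dx = v dt becomes the
    stochastic sampler dx = v dt + sigma dW, enabling exploration."""
    x, dt = x0.copy(), 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x += velocity(x, t) * dt + sigma * np.sqrt(dt) * rng.normal(size=x.shape)
    return x

# Toy velocity field pulling all samples toward one goal point.
goal = np.array([1.0, 2.0])
vel = lambda x, t: goal - x

x0 = np.zeros((8, 2))            # 8 rollouts from the same start
samples = sample_sde(vel, x0)
print(samples.std(axis=0))       # nonzero spread: the rollouts diverged
```

With `sigma = 0`, all eight rollouts would be identical; the injected noise is what turns a single deterministic prediction into a sampled set of multimodal futures.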

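Finally, the reward-driven optimization follows GRPO's critic-free recipe: score a group of sampled rollouts per scene, then normalize rewards within the group to get advantages. The equal 0.5/0.5 weighting and the two toy score vectors below are hypothetical; the paper's actual reward combines view-aware social compliance with map-aware physical feasibility.

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages as in GRPO: normalize each group's rewards
    by the group mean and std, with no learned value critic."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Composite reward over K=4 sampled rollouts of one scene (toy scores):
social = np.array([0.9, 0.4, 0.7, 0.2])     # higher = fewer near-collisions
physical = np.array([1.0, 1.0, 0.5, 0.8])   # higher = stays on walkable map
rewards = 0.5 * social + 0.5 * physical     # assumed equal weighting
adv = grpo_advantages(rewards)
print(adv)   # best rollout gets a positive advantage, worst a negative one
```

Rollouts with above-average composite reward receive positive advantages, so the policy update steers the flow sampler toward socially compliant and physically feasible trajectories.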


