From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space
arXiv cs.CV / 3/16/2026
💬 Opinion · Models & Research
Key Points
- MV-GRPO extends Group Relative Policy Optimization by augmenting the condition space with a Condition Enhancer to generate semantically adjacent yet diverse captions, enabling dense multi-view reward mapping for T2I flow models.
- The approach addresses a limitation of single-view evaluation: scoring each sample under a single caption leaves inter-sample relationships underexplored, which can cap alignment performance.
- It computes the original samples' probability distribution conditioned on the new captions and incorporates these signals into training without requiring costly sample regeneration.
- Experimental results show MV-GRPO achieves superior alignment performance compared with state-of-the-art methods.
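The core idea in the bullets above can be sketched as a group-relative advantage computed over a dense sample-by-view reward map rather than a single-view group. The sketch below is a hypothetical illustration, not the paper's implementation: the function name, reward layout, and normalization details are assumptions, and the real method additionally reuses the original samples' likelihoods under the enhanced captions.

```python
# Hypothetical sketch of dense multi-view advantage estimation:
# each generated sample i is scored under K semantically adjacent
# captions (the "augmented condition space"), and the GRPO-style
# advantage is normalized over the whole sample-by-view reward map
# instead of a single-view group.

import statistics

def multi_view_advantages(rewards):
    """rewards[i][k]: reward of sample i scored under caption view k.

    Returns per-sample, per-view advantages normalized by the mean
    and population std of the full group, mirroring GRPO's
    group-relative baseline but over the dense reward map.
    """
    flat = [r for row in rewards for r in row]
    mu = statistics.fmean(flat)
    sigma = statistics.pstdev(flat) or 1.0  # guard against zero std
    return [[(r - mu) / sigma for r in row] for row in rewards]
```

For example, with two samples each scored under two caption views, `multi_view_advantages([[1.0, 3.0], [3.0, 5.0]])` rewards the best sample-view pair and penalizes the worst relative to the shared group baseline, with no extra sample generation needed.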