From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space
arXiv cs.CV / 3/16/2026
💬 Opinion · Models & Research
Key Points
- MV-GRPO extends Group Relative Policy Optimization (GRPO) for text-to-image (T2I) flow models by augmenting the condition space: a Condition Enhancer generates captions that are semantically adjacent to the original prompt yet diverse, which enables a dense multi-view reward mapping.
- The approach targets the limitation of single-view evaluation, which underexplores inter-sample relationships and can cap alignment performance.
- It computes the original samples' probability distribution conditioned on the new captions and incorporates these signals into training without requiring costly sample regeneration.
- Experimental results show MV-GRPO achieves superior alignment performance compared with state-of-the-art methods.
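
The multi-view idea in the points above can be illustrated with a minimal sketch. It assumes, since this summary does not give the paper's exact formulation, that each of the G original samples is scored against every caption (the original prompt plus the augmented ones) and that the standard GRPO group normalization is applied separately within each caption's view. The names `group_relative_advantages`, `multi_view_advantages`, and `reward_matrix` are hypothetical, and the reward values are made-up toy numbers, not results from the paper.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style normalization: center and scale rewards within one group."""
    mu = fmean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def multi_view_advantages(reward_matrix, eps=1e-6):
    """reward_matrix[k][i] is the reward of sample i scored against caption k.

    Assumption (not confirmed by the source): each caption is treated as its
    own view and normalized independently, so one set of samples yields
    K rows of advantages instead of a single row.
    """
    return [group_relative_advantages(row, eps) for row in reward_matrix]


# Toy data: row 0 = original prompt, rows 1-2 = augmented captions; 4 samples.
reward_matrix = [
    [0.2, 0.4, 0.6, 0.8],
    [0.5, 0.5, 0.1, 0.9],
    [0.3, 0.3, 0.3, 0.3],  # a view that cannot separate the samples
]
advantages = multi_view_advantages(reward_matrix)
```

Under these assumptions, the same four samples produce three rows of training signal without generating any new images. A view whose rewards are identical across samples, like the last row, contributes zero advantage for every sample.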