Risk-Controllable Multi-View Diffusion for Driving Scenario Generation
arXiv cs.CV / 3/13/2026
Key Points
- RiskMV-DPO is a general pipeline for physically informed, risk-controllable generation of multi-view driving scenarios. It conditions diffusion-based video synthesis on target risk levels and grounded risk modeling.
- The approach adds a geometry-appearance alignment module and a region-aware direct preference optimization (RA-DPO) strategy with motion-aware masking, to ensure spatio-temporal coherence and to focus learning on dynamic regions.
- On the nuScenes dataset, RiskMV-DPO generates diverse long-tail scenarios while achieving state-of-the-art visual quality, increasing 3D detection mAP from 18.17 to 30.50 and reducing FID to 15.70.
- This work shifts world models from passive environment prediction to proactive, risk-controllable synthesis, offering a scalable toolchain for safety-oriented embodied intelligence development.
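As a rough illustration of the region-aware preference idea in the Key Points, the sketch below weights a generic Diffusion-DPO-style loss by a motion mask so that dynamic pixels dominate the error. It is not the paper's implementation: the function names, the frame-differencing mask, and the `threshold`, `static_weight`, and `beta` parameters are all hypothetical choices made for this example.

```python
import math

def motion_mask(prev_frame, next_frame, threshold=0.1):
    """Hypothetical mask: 1.0 where the frame difference exceeds threshold, else 0.0."""
    return [[1.0 if abs(b - a) > threshold else 0.0 for a, b in zip(row_p, row_n)]
            for row_p, row_n in zip(prev_frame, next_frame)]

def masked_error(pred_noise, true_noise, mask, static_weight=0.1):
    """Weighted mean squared denoising error.

    Dynamic pixels (mask 1.0) get weight 1.0; static pixels get static_weight.
    """
    total, weight_sum = 0.0, 0.0
    for row_p, row_t, row_m in zip(pred_noise, true_noise, mask):
        for p, t, m in zip(row_p, row_t, row_m):
            w = m + (1.0 - m) * static_weight
            total += w * (p - t) ** 2
            weight_sum += w
    return total / weight_sum

def region_aware_dpo_loss(err_policy_win, err_ref_win,
                          err_policy_lose, err_ref_lose, beta=1.0):
    """Diffusion-DPO-style loss, -log(sigmoid(-beta * margin)), on masked errors.

    margin < 0 means the policy improves on the preferred sample more than on
    the rejected one, relative to the frozen reference model.
    """
    margin = (err_policy_win - err_ref_win) - (err_policy_lose - err_ref_lose)
    return math.log1p(math.exp(beta * margin))

# Toy 2x2 frames in which only the top-right pixel moves.
mask = motion_mask([[0.0, 0.0], [0.0, 0.0]], [[0.0, 1.0], [0.0, 0.0]])
zeros = [[0.0, 0.0], [0.0, 0.0]]
dynamic_err = masked_error([[0.0, 1.0], [0.0, 0.0]], zeros, mask)  # error on the moving pixel
static_err = masked_error([[1.0, 0.0], [0.0, 0.0]], zeros, mask)   # same error on a static pixel
```

In this toy setup the same per-pixel mistake costs more when it falls on the moving pixel than on a static one, which is the intuition behind focusing preference learning on dynamic regions.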