ScaleEdit-12M: Scaling Open-Source Image Editing Data Generation via Multi-Agent Framework
arXiv cs.CV / 3/24/2026
Key Points
- The paper introduces ScaleEditor, a fully open-source hierarchical multi-agent framework designed to generate large-scale, diverse, and high-quality instruction-based image editing datasets without relying on costly proprietary APIs.
- The end-to-end pipeline combines (1) source image expansion with world-knowledge infusion, (2) adaptive multi-agent instruction-image synthesis, and (3) task-aware data quality verification to improve edit realism and generalizability.
- Using ScaleEditor, the authors curate ScaleEdit-12M, reported as the largest open-source image editing dataset to date, covering 23 task families across both real and synthetic domains.
- Fine-tuning UniWorld-V1 and Bagel on ScaleEdit yields consistent gains: up to 10.4% on ImgEdit and 35.1% on GEdit (general editing benchmarks), and up to 150.0% on RISE and 26.5% on KRIS-Bench (knowledge-infused benchmarks).
- The authors argue that open-source agentic dataset pipelines can approach commercial-grade data quality while remaining cost-effective and scalable. Both the framework and the dataset are planned to be open-sourced.
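
To make the three-stage pipeline in the key points more concrete, here is a minimal Python sketch of the control flow: expand source images, synthesize an instruction and edit for each, then filter with a task-aware scorer. It is not the authors' implementation. Every name in it (`EditSample`, `expand_sources`, `synthesize`, `verify`, `run_pipeline`, the task labels, and the threshold) is a hypothetical placeholder, and each stage is a trivial stand-in for what would be a model or agent call in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class EditSample:
    """One candidate training example (all fields are illustrative)."""
    source_image: str        # path or ID of the source image
    task: str                # hypothetical task label, e.g. "object_removal"
    instruction: str = ""    # filled in by the synthesis stage
    edited_image: str = ""   # filled in by the synthesis stage
    score: float = 0.0       # filled in by the verification stage


Scorer = Callable[[EditSample], float]


def expand_sources(seeds: Iterable[str], tasks: List[str]) -> List[EditSample]:
    """Stage 1 stand-in: pair every seed image with every task."""
    return [EditSample(source_image=s, task=t) for s in seeds for t in tasks]


def synthesize(sample: EditSample) -> EditSample:
    """Stage 2 stand-in: a real agent would write the instruction and run an editor."""
    sample.instruction = f"Apply {sample.task} to {sample.source_image}"
    sample.edited_image = f"{sample.source_image}.{sample.task}.png"
    return sample


def verify(sample: EditSample, scorers: Dict[str, Scorer],
           default: Scorer, threshold: float) -> bool:
    """Stage 3 stand-in: pick the scorer by task, then apply a threshold."""
    scorer = scorers.get(sample.task, default)
    sample.score = scorer(sample)
    return sample.score >= threshold


def run_pipeline(seeds: Iterable[str], tasks: List[str],
                 scorers: Dict[str, Scorer], default: Scorer,
                 threshold: float = 0.5) -> List[EditSample]:
    """Chain the three stages and return only the samples that pass verification."""
    candidates = expand_sources(seeds, tasks)
    synthesized = [synthesize(c) for c in candidates]
    return [s for s in synthesized if verify(s, scorers, default, threshold)]
```

The point of the sketch is the per-task scorer lookup in `verify`: different edit types can be judged by different criteria, which is one plausible reading of "task-aware" verification.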