VeraRetouch: A Lightweight Fully Differentiable Framework for Multi-Task Reasoning Photo Retouching
arXiv cs.CV / 5/1/2026
Key Points
- The paper proposes VeraRetouch, a lightweight, fully differentiable multi-task reasoning framework for photo retouching that can jointly analyze defects, produce reasoning plans, and apply precise edits.
- It uses a compact 0.5B vision-language model to generate retouching plans from instructions and scene semantics, and replaces external non-differentiable tools with a fully differentiable Retouch Renderer for end-to-end pixel-level training.
- The Retouch Renderer is trained with decoupled control latents for lighting, global color, and targeted color adjustments, reducing optimization barriers and parameter redundancy while improving generalization.
- To address limited data, the authors introduce AetherRetouch-1M+, a million-scale dataset for professional retouching created via a new inverse degradation workflow, and they add DAPO-AE, a reinforcement learning post-training method for better autonomous aesthetic cognition.
- Experiments reportedly show state-of-the-art results on multiple benchmarks with a much smaller model footprint, supporting mobile deployment, and the code/models are released on GitHub.
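To make the renderer idea concrete, here is a minimal sketch of what a fully differentiable retouch pipeline with decoupled control latents might look like. This is an illustrative assumption, not the paper's actual implementation: the function names, latent shapes, and specific operations (exposure/gamma for lighting, per-channel gains for global color, a soft mask plus offsets for targeted color) are hypothetical stand-ins. The point is that every step is a smooth tensor operation, so a pixel-level loss can backpropagate to all control latents end to end.

```python
import numpy as np

def retouch_render(img, lighting, global_color, target_color, mask):
    """Hypothetical sketch of a differentiable retouch renderer.

    img          : H x W x 3 float array in [0, 1].
    lighting     : (exposure, gamma) latent controlling brightness/tone.
    global_color : per-channel gain latent for global color balance.
    target_color : per-channel offset applied only inside `mask`.
    mask         : H x W soft mask in [0, 1] selecting the targeted region.

    Every operation here is smooth in its inputs, so in a real framework
    (e.g. PyTorch) gradients would flow from a pixel loss back to each
    decoupled control latent, with no external non-differentiable tools.
    """
    exposure, gamma = lighting
    # Lighting latent: exposure in stops, then a gamma tone curve.
    out = np.clip(img * np.exp2(exposure), 1e-6, 1.0) ** gamma
    # Global color latent: independent per-channel gains.
    out = out * np.asarray(global_color)
    # Targeted color latent: offsets blended in only where the mask is active.
    out = out + mask[..., None] * np.asarray(target_color)
    return np.clip(out, 0.0, 1.0)

# Example: brighten a gray image and warm only the masked left half.
img = np.full((4, 4, 3), 0.5)
mask = np.zeros((4, 4))
mask[:, :2] = 1.0
out = retouch_render(
    img,
    lighting=(0.5, 1.0),          # +0.5 stops exposure, identity gamma
    global_color=(1.0, 1.0, 1.0),  # no global color shift
    target_color=(0.1, 0.0, -0.05),  # warm shift in the masked region
    mask=mask,
)
```

Decoupling the latents this way means each control edits one factor of the image in isolation, which is the intuition behind the paper's claim that separation reduces optimization barriers and parameter redundancy.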