ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent
arXiv cs.CV / 4/29/2026
Key Points
- The paper proposes ResetEdit, a diffusion-model editing framework for precise text-guided modifications that edits local regions while preserving global image structure.
- It argues that inversion-based methods (e.g., DDIM inversion) produce poor “starting latents,” hurting edit fidelity and causing structural inconsistencies during editing.
- ResetEdit addresses this by embedding recoverable information into the generation process: it injects the discrepancy between clean and diffused latents, then extracts that discrepancy during inversion to reconstruct a "resettable" latent close to the true starting state.
- To further improve results, it introduces a lightweight latent optimization module that compensates for reconstruction bias arising from VAE asymmetry.
- Evaluated on Stable Diffusion, ResetEdit reportedly integrates with tuning-free editing methods and consistently outperforms prior baselines in controllability and visual quality.
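To make the motivation in the second bullet concrete, here is a minimal toy sketch (not the paper's method, and `eps_model` is a made-up stand-in for a trained noise predictor) of why DDIM inversion yields only an approximate starting latent: the exact inverse of a DDIM step would need the noise prediction at the *next* state, but inversion reuses the prediction at the current state, so a nonlinear predictor accumulates round-trip error.

```python
import numpy as np

rng = np.random.default_rng(0)

def eps_model(x, t):
    # Illustrative stand-in "noise predictor": any nonlinear function
    # of x makes the inversion linearization inexact.
    return np.tanh(x) * (0.1 + 0.01 * t)

def ddim_step(x, a_t, a_prev, t):
    # Deterministic DDIM update from timestep t to t-1,
    # using the noise prediction at the current state x_t.
    e = eps_model(x, t)
    x0 = (x - np.sqrt(1 - a_t) * e) / np.sqrt(a_t)
    return np.sqrt(a_prev) * x0 + np.sqrt(1 - a_prev) * e

def ddim_invert_step(x, a_t, a_prev, t):
    # Inversion from t-1 back to t. The exact inverse would need
    # eps(x_t, t), but only x_{t-1} is available, so eps is evaluated
    # there instead -- this approximation is the source of drift.
    e = eps_model(x, t)
    x0 = (x - np.sqrt(1 - a_prev) * e) / np.sqrt(a_prev)
    return np.sqrt(a_t) * x0 + np.sqrt(1 - a_t) * e

alphas = np.linspace(0.99, 0.5, 20)  # toy alpha-bar schedule
x_T = rng.standard_normal(4)         # "true" starting latent

# Forward (denoising) pass: x_T -> x_0.
x = x_T.copy()
traj = [x.copy()]
for i in range(len(alphas) - 1):
    x = ddim_step(x, alphas[i], alphas[i + 1], t=i)
    traj.append(x.copy())

# Inversion pass: x_0 -> estimated x_T.
x_inv = traj[-1].copy()
for i in reversed(range(len(alphas) - 1)):
    x_inv = ddim_invert_step(x_inv, alphas[i], alphas[i + 1], t=i)

# The recovered latent does not exactly match the true one.
err = np.linalg.norm(x_inv - x_T) / np.linalg.norm(x_T)
print(f"relative reconstruction error of inverted latent: {err:.4f}")
```

Editing pipelines that restart generation from this inverted latent inherit the error, which is the fidelity gap the paper's "resettable" latent (and, separately, its latent-optimization correction for VAE encode/decode asymmetry) is said to target.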
