Unlearning the Unpromptable: Prompt-free Instance Unlearning in Diffusion Models
arXiv cs.LG · 3/12/2026
Key Points
- The paper introduces prompt-free instance unlearning for diffusion models, aiming to forget undesired outputs that cannot be specified by text prompts, such as faces or culturally misinterpreted depictions.
- It proposes a surrogate-based unlearning method that combines image editing, timestep-aware weighting, and gradient surgery to guide models toward forgetting targeted outputs while preserving overall integrity.
- Experiments on conditional (Stable Diffusion 3) and unconditional (DDPM-CelebA) diffusion models demonstrate that the method uniquely unlearns unpromptable outputs and outperforms prompt-based and prompt-free baselines.
- The work positions the method as a practical post-hoc "hotfix" that diffusion model providers can apply to strengthen privacy protection and ethical compliance.
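The key points above mention two mechanical ingredients: timestep-aware weighting and gradient surgery. A minimal sketch of how such ingredients could combine in an unlearning step, assuming a PCGrad-style conflict projection and a simple linear timestep schedule (the function names, schedule, and learning rate are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def timestep_weight(t, T=1000):
    # Hypothetical schedule: weight the forgetting signal more heavily at
    # later (noisier) diffusion timesteps. The paper's actual weighting
    # is not reproduced here.
    return t / T

def gradient_surgery(g_forget, g_retain, eps=1e-12):
    # PCGrad-style projection: if the forgetting gradient conflicts with
    # the retain gradient (negative dot product), remove its component
    # along the retain direction so unlearning the target does not
    # degrade retained model behavior.
    dot = float(np.dot(g_forget, g_retain))
    if dot < 0.0:
        g_forget = g_forget - (dot / (float(np.dot(g_retain, g_retain)) + eps)) * g_retain
    return g_forget

def unlearning_update(g_forget, g_retain, t, lr=1e-4, T=1000):
    # Combined parameter update: retain gradient plus a timestep-weighted,
    # surgically projected forgetting gradient (descent direction).
    g = g_retain + timestep_weight(t, T) * gradient_surgery(g_forget, g_retain)
    return -lr * g
```

In this sketch the projection guarantees the applied forgetting component never points against the retain gradient, which is one common way to formalize "forget the target while preserving overall integrity."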