SERUM: Simple, Efficient, Robust, and Unifying Marking for Diffusion-based Image Generation

arXiv cs.CV / 3/17/2026

📰 News | Tools & Practical Usage | Models & Research

Key Points

  • SERUM introduces a simple method: add a unique watermark noise to the initial diffusion generation noise and train a lightweight detector to identify watermarked images.
  • It aims to be robust against image augmentations and watermark removal attacks while preserving image quality and being computationally efficient.
  • The authors report the highest true positive rate (TPR) at a 1% false positive rate (FPR) in most scenarios, along with fast injection and detection and low detector training overhead.
  • Its decoupled architecture enables multiple users to embed individualized watermarks with minimal interference between marks.
  • It provides a practical solution to mark outputs from diffusion models and reliably distinguish generated from natural images.
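As a rough illustration of the first and fourth key points, here is a minimal sketch of blending a per-user watermark pattern into the initial diffusion noise. It is not the authors' implementation: the function names (`make_watermark`, `inject`), the seed-based derivation of the pattern, the `strength` value, and the variance-preserving blend are all assumptions made for illustration. The correlation check at the end is only a sanity check in noise space; SERUM itself detects marks with a trained detector on the generated images.

```python
import numpy as np

def make_watermark(user_id: int, shape: tuple) -> np.ndarray:
    """Hypothetical helper: derive a fixed Gaussian pattern from a per-user seed."""
    rng = np.random.default_rng(user_id)
    return rng.standard_normal(shape)

def inject(initial_noise: np.ndarray, watermark: np.ndarray,
           strength: float = 0.2) -> np.ndarray:
    """Blend the watermark into the initial noise (assumed scheme).

    If both inputs are independent N(0, 1), the result has variance
    (1 - strength**2) + strength**2 = 1, so the sampler still receives
    noise with the scale it expects.
    """
    return np.sqrt(1.0 - strength**2) * initial_noise + strength * watermark

# Example: two users mark the same latent-sized noise tensor.
shape = (4, 64, 64)                      # assumed latent shape
z = np.random.default_rng(0).standard_normal(shape)
w_alice = make_watermark(7, shape)
w_bob = make_watermark(11, shape)
z_alice = inject(z, w_alice)

# Sanity check only: the marked noise correlates with its own pattern
# and barely with another user's pattern.
corr_own = float(np.mean(z_alice * w_alice))
corr_other = float(np.mean(z_alice * w_bob))
```

Because independent Gaussian patterns in a high-dimensional space are nearly orthogonal, `corr_other` stays close to zero, which gives some intuition for why individualized marks might interfere little with each other.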

Abstract

We propose SERUM: an intriguingly simple yet highly effective method for marking images generated by diffusion models (DMs). We only add a unique watermark noise to the initial diffusion generation noise and train a lightweight detector to identify watermarked images, simplifying and unifying the strengths of prior approaches. SERUM provides robustness against any image augmentations or watermark removal attacks and is extremely efficient, all while maintaining negligible impact on image quality. In contrast to prior approaches, which are often only resilient to limited perturbations and incur significant training, injection, and detection costs, our SERUM achieves remarkable performance, with the highest true positive rate (TPR) at a 1% false positive rate (FPR) in most scenarios, along with fast injection and detection and low detector training overhead. Its decoupled architecture also seamlessly supports multiple users by embedding individualized watermarks with little interference between the marks. Overall, our method provides a practical solution to mark outputs from DMs and to reliably distinguish generated from natural images.
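The headline metric in the abstract, TPR at a 1% FPR, can be computed from detector scores as in the sketch below. It assumes a detector that outputs a scalar score per image, with higher meaning "more likely watermarked"; the function name `tpr_at_fpr` and the score convention are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def tpr_at_fpr(scores_clean, scores_marked, target_fpr: float = 0.01):
    """Return (tpr, threshold) with the empirical FPR kept at or below target_fpr.

    scores_clean:  detector scores on non-watermarked images
    scores_marked: detector scores on watermarked images
    An image is flagged when its score is strictly above the threshold.
    """
    clean = np.sort(np.asarray(scores_clean, dtype=float))
    marked = np.asarray(scores_marked, dtype=float)
    if clean.size == 0 or marked.size == 0:
        raise ValueError("both score arrays must be non-empty")
    # Largest number of clean images we may flag without exceeding the target.
    k = int(np.floor(target_fpr * clean.size))
    # At most k clean scores lie strictly above this value.
    threshold = clean[clean.size - k - 1]
    tpr = float(np.mean(marked > threshold))
    return tpr, float(threshold)
```

The threshold is chosen on clean images only, so the reported TPR reflects how many watermarked images are caught when at most 1 in 100 unmarked images is falsely flagged.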