SAiW: Source-Attributable Invisible Watermarking for Proactive Deepfake Defense

arXiv cs.AI / 3/25/2026


Key Points

  • The paper proposes SAiW, a proactive deepfake defense that embeds a source-attributable invisible watermark at the time of creation so that media provenance can be verified afterwards.
  • SAiW treats watermark embedding as a source-conditioned representation learning problem: the watermark identity encodes the originating source and modulates the embedding network through feature-wise linear modulation, so signatures remain discriminative and traceable across multiple sources.
  • A perceptual guidance module based on human visual system priors is used to keep watermark perturbations visually imperceptible while preserving robustness.
  • A dual-purpose forensic decoder reconstructs the watermark and performs source attribution, aiming to provide both automated verification and interpretable forensic evidence.
  • Experiments across multiple deepfake datasets indicate strong robustness to common real-world transformations and attacks (compression, filtering, noise, geometric changes, and adversarial perturbations) while maintaining high perceptual quality.
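
Claims about perceptual quality and robustness are usually backed by simple numeric metrics. The summary does not say which metrics the paper reports, so the sketch below assumes two common choices: PSNR between the cover and the watermarked image for imperceptibility, and bit accuracy between the embedded and decoded payload for robustness. The function names and the toy data are illustrative only and are not taken from the paper.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def bit_accuracy(embedded_bits, decoded_bits):
    """Fraction of watermark bits that the decoder recovered correctly."""
    if len(embedded_bits) != len(decoded_bits) or not embedded_bits:
        raise ValueError("inputs must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(embedded_bits, decoded_bits))
    return matches / len(embedded_bits)


# Toy data (assumed, not from the paper): every pixel moves by +/-1, so MSE = 1.
cover = [100, 120, 140, 160]
marked = [101, 119, 141, 159]
print(round(psnr(cover, marked), 2))  # 48.13

# One of eight payload bits flipped after a hypothetical attack.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
received = [1, 0, 1, 0, 0, 0, 1, 0]
print(bit_accuracy(sent, received))  # 0.875
```

In a robustness study of this kind, bit accuracy would typically be recomputed after each transformation (compression, filtering, noise, geometric changes, adversarial perturbations), while PSNR is measured once on the unattacked watermarked image.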

Abstract

Deepfakes generated by modern generative models pose a serious threat to information integrity, digital identity, and public trust. Existing detection methods are largely reactive, attempting to identify manipulations after they occur and often failing to generalize across evolving generation techniques. This motivates proactive mechanisms that secure media authenticity at the time of creation. In this work, we introduce SAiW, a Source-Attributable Invisible Watermarking framework for proactive deepfake defense and media provenance verification. Unlike conventional watermarking methods that treat watermark payloads as generic signals, SAiW formulates watermark embedding as a source-conditioned representation learning problem, where watermark identity encodes the originating source and modulates the embedding process to produce discriminative and traceable signatures. The framework integrates feature-wise linear modulation to inject source identity into the embedding network, enabling scalable multi-source watermark generation. A perceptual guidance module derived from human visual system priors ensures that watermark perturbations remain visually imperceptible while maintaining robustness. In addition, a dual-purpose forensic decoder simultaneously reconstructs the embedded watermark and performs source attribution, providing both automated verification and interpretable forensic evidence. Extensive experiments across multiple deepfake datasets demonstrate that SAiW achieves high perceptual quality while maintaining strong robustness against compression, filtering, noise, geometric transformations, and adversarial perturbations. By binding digital media to its origin through invisible yet verifiable markers, SAiW enables reliable authentication and source attribution, providing a scalable foundation for proactive deepfake defense and trustworthy media provenance.
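
Feature-wise linear modulation (FiLM), which the abstract names as the mechanism for injecting source identity, scales and shifts each feature channel with parameters derived from a conditioning input: y_c = gamma_c(s) * x_c + beta_c(s), where s is the source. The sketch below is a minimal illustration under stated assumptions, not the paper's network: the `SOURCE_PARAMS` table, the source names, and the tiny feature map are hypothetical, and a trained model would produce gamma and beta from a learned conditioning network rather than a fixed lookup.

```python
def film(features, gamma, beta):
    """Apply FiLM per channel: y[c][i] = gamma[c] * x[c][i] + beta[c]."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one (gamma, beta) pair per feature channel")
    return [
        [g * value + b for value in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Hypothetical per-source FiLM parameters (assumption for illustration only).
SOURCE_PARAMS = {
    "camera_A": {"gamma": [1.0, 0.5], "beta": [0.0, 1.0]},
    "camera_B": {"gamma": [2.0, -1.0], "beta": [0.5, 0.0]},
}


def modulate_for_source(features, source_id):
    """Condition the same features on a source identity via FiLM."""
    params = SOURCE_PARAMS[source_id]
    return film(features, params["gamma"], params["beta"])


# Two channels with two activations each.
features = [[1.0, 2.0], [3.0, 4.0]]
print(modulate_for_source(features, "camera_A"))  # [[1.0, 2.0], [2.5, 3.0]]
print(modulate_for_source(features, "camera_B"))  # [[2.5, 4.5], [-3.0, -4.0]]
```

The same input features yield different outputs per source, which is the property that lets a single embedding network produce distinguishable signatures for many sources without training a separate network for each.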