ArmSSL: Adversarial Robust Black-Box Watermarking for Self-Supervised Learning Pre-trained Encoders
arXiv cs.AI / 4/27/2026
Key Points
- The paper proposes ArmSSL, a watermarking framework for self-supervised learning (SSL) pre-trained encoders that targets both black-box ownership verification and adversarial robustness.
- For black-box verification, ArmSSL uses a paired discrepancy enlargement method that enforces orthogonality in feature space between each clean input and its watermarked counterpart, producing a reliable verification signal even when the suspect encoder can only be queried as a black box.
- To resist adversarial watermark detection or removal, ArmSSL avoids watermark out-of-distribution (OOD) clustering by combining latent representation entanglement and distribution alignment so watermark features resemble natural in-distribution samples.
- The approach includes a reference-guided watermark tuning strategy that learns the watermark as a small side task while preserving downstream utility by matching the watermarked encoder’s outputs to the clean encoder’s outputs on normal data.
- Experiments across five SSL frameworks and nine benchmark datasets show that ArmSSL provides better ownership verification with negligible utility loss and strong robustness against state-of-the-art adversarial detection and removal methods.
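
The verification and utility-preservation ideas in the key points can be illustrated with a toy sketch. This is not the paper's implementation: the squared-cosine orthogonality loss, the mean-squared-error utility term, the 0.2 decision threshold, and the two toy encoders are all assumptions chosen only to make the idea concrete. It uses only the Python standard library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def orthogonality_loss(clean_feats, wm_feats):
    """Assumed training term: mean squared cosine over (clean, watermark) pairs.
    It is zero when every pair of features is orthogonal."""
    pairs = list(zip(clean_feats, wm_feats))
    return sum(cosine(c, w) ** 2 for c, w in pairs) / len(pairs)


def utility_loss(wm_encoder_feats, ref_encoder_feats):
    """Assumed utility term: mean squared error between the watermarked
    encoder's features and a clean reference encoder's features on normal data."""
    total, count = 0.0, 0
    for f, r in zip(wm_encoder_feats, ref_encoder_feats):
        for a, b in zip(f, r):
            total += (a - b) ** 2
            count += 1
    return total / count


def verify(encoder, pairs, threshold=0.2):
    """Black-box check: only the encoder's outputs are used.
    Returns the mean absolute cosine over the pairs and whether it falls
    below the (hypothetical) threshold, which would suggest a watermark."""
    score = sum(abs(cosine(encoder(c), encoder(w))) for c, w in pairs) / len(pairs)
    return score, score < threshold


# Toy inputs: (x, y, trigger_flag). Both encoders below are hypothetical.
def watermarked_encoder(inp):
    x, y, trigger = inp
    # Rotates the feature by 90 degrees when the trigger is present.
    return (-y, x) if trigger else (x, y)


def independent_encoder(inp):
    x, y, _ = inp
    # Ignores the trigger entirely.
    return (x, y)


pairs = [
    ((1.0, 2.0, 0), (1.0, 2.0, 1)),
    ((3.0, -1.0, 0), (3.0, -1.0, 1)),
]

print(verify(watermarked_encoder, pairs))
print(verify(independent_encoder, pairs))
```

In this toy setting the watermarked encoder maps each clean/watermark pair to orthogonal features, so the check flags it, while the independent encoder maps both inputs to the same feature and is not flagged. A real verifier would need many query pairs and a statistically justified threshold.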