Generalized Disguise Makeup Presentation Attack Detection Using an Attention-Guided Patch-Based Framework

arXiv cs.CV / 4/30/2026


Key Points

  • The paper addresses the challenge of disguise makeup presentation attacks that can fool facial recognition systems by using realistic cosmetics, prosthetics, and materials.
  • It proposes a generalized detection framework with a two-phase approach: a style-invariant full-face model generates region attention (via Grad-CAM) and a patch-based stage performs localized, region-specific analysis.
  • The method uses metric learning and a whitening transformation to improve discrimination and reduce sensitivity to stylistic variations in faces.
  • A new real-world dataset is introduced, containing live and disguise makeup faces with broad variation in subjects, environments, and disguise materials.
  • Experiments show strong cross-dataset generalization: 8.97% ACER and 9.76% EER on the new dataset, 0% ACER on the Obfuscation and Impersonation attacks of SIW-Mv2, and 1.34% ACER on its Cosmetics attack.

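The attention-guided, patch-based stage in the second bullet can be sketched as a simple "rank regions by heatmap score, keep the top-k" step. This is a minimal illustration, not the paper's implementation: the function name `select_patches`, the fixed 16-pixel grid, and `top_k` are assumptions, and a real pipeline would derive the heatmap from Grad-CAM on the full-face model rather than the toy map below.

```python
import numpy as np

def select_patches(attention_map, patch=16, top_k=4):
    """Rank fixed-size face regions by mean attention and keep the top-k.

    attention_map: (H, W) Grad-CAM-style heatmap with values in [0, 1].
    Returns (row, col) offsets of the top-k patches, highest score first.
    """
    H, W = attention_map.shape
    scored = []
    for r in range(0, H - patch + 1, patch):
        for c in range(0, W - patch + 1, patch):
            # Score each non-overlapping patch by its mean attention
            scored.append((attention_map[r:r + patch, c:c + patch].mean(), (r, c)))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [pos for _, pos in scored[:top_k]]

# Toy heatmap: attention concentrated in one horizontal band (e.g. the eye region)
heat = np.zeros((64, 64))
heat[16:32, 16:48] = 1.0
print(select_patches(heat, patch=16, top_k=2))  # → [(16, 16), (16, 32)]
```

Each selected offset would then be cropped from the face image and passed to its region-specific subnetwork for localized analysis.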
Abstract

Despite significant advances in facial recognition systems, they remain vulnerable to face presentation attacks. Among them, disguise makeup attacks are particularly challenging, as they use advanced cosmetics, prosthetic components, and artificial materials to realistically alter facial appearance, often making detection difficult even for humans. Despite its importance, this problem remains underexplored, and publicly available datasets are limited. To address this, we propose a generalized disguise makeup presentation attack detection framework. The method adopts a two-phase design in which a style-invariant full-face model, trained with metric learning and enhanced by a whitening transformation, extracts region attention scores via Grad-CAM. These scores guide a patch-based phase that performs localized analysis using region-specific subnetworks trained with metric learning for fine-grained discrimination. We also construct a new, diverse dataset of live and disguise makeup faces collected under real-world conditions, covering variations in subjects, environments, and disguise materials. Experimental results demonstrate strong generalization across both the collected dataset and SIW-Mv2, achieving 8.97% ACER and 9.76% EER on the collected dataset, 0% ACER on the Obfuscation and Impersonation attacks of SIW-Mv2, and 1.34% ACER on its Cosmetics attack. The proposed method consistently outperforms prior works while maintaining robust performance across other spoof types.
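The whitening transformation used to reduce stylistic sensitivity can be illustrated with a ZCA-style whitening of feature embeddings, which decorrelates feature dimensions so the covariance becomes approximately the identity. This is a sketch under assumptions: the paper does not specify its exact formulation here, and the function name and `eps` regularizer below are illustrative.

```python
import numpy as np

def zca_whiten(features, eps=1e-5):
    """ZCA-whiten a batch of embeddings so their covariance is ~identity.

    features: (n_samples, dim) array. Returns (whitened batch, whitening matrix).
    """
    centered = features - features.mean(axis=0)
    cov = centered.T @ centered / (len(features) - 1)
    # Covariance is symmetric, so use its eigendecomposition
    eigvals, eigvecs = np.linalg.eigh(cov)
    # W = E diag(1/sqrt(lambda + eps)) E^T — the ZCA whitening matrix
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ W, W

rng = np.random.default_rng(0)
emb = rng.normal(size=(256, 8))
emb[:, 0] += emb[:, 1]  # introduce correlation between two feature dims
white, W = zca_whiten(emb)
cov_after = white.T @ white / (len(white) - 1)
print(np.allclose(cov_after, np.eye(8), atol=1e-3))  # → True
```

After whitening, distances between embeddings are no longer dominated by a few correlated, high-variance directions, which is what makes it a natural companion to metric learning.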