Leveraging Avatar Fingerprinting: A Multi-Generator Photorealistic Talking-Head Public Database and Benchmark

arXiv cs.CV / 3/31/2026


Key Points

  • The paper introduces AVAPrintDB, a new public talking-head avatar dataset built with multiple state-of-the-art avatar generators to support avatar fingerprinting research in more realistic impersonation settings.
  • AVAPrintDB includes both self- and cross-reenactment samples, aiming to simulate legitimate usage and identity impersonation scenarios that existing datasets do not cover well.
  • The authors define a standardized, reproducible benchmark for avatar fingerprinting and evaluate public state-of-the-art methods as well as approaches using foundation models like DINOv2 and CLIP.
  • Results indicate that identity-relevant motion cues can persist across synthetic avatars, but current fingerprinting systems are still highly sensitive to generator/synthesis pipeline changes and dataset/source shifts.
  • The dataset, benchmark protocols, and fingerprinting systems are released publicly to enable reproducible research and better study of robustness under domain and generator shift.
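The avatar fingerprinting task described above is a verification problem: given two avatar videos, decide whether they are driven by the same human operator. A minimal sketch of the foundation-model-based approach the paper explores might pool per-frame embeddings (e.g. from DINOv2 or CLIP) into one descriptor per video and compare descriptors with cosine similarity. The pooling and scoring below are illustrative assumptions, not the paper's exact method, and the demo substitutes synthetic embeddings for real model features.

```python
# Illustrative embedding-based verifier for avatar fingerprinting.
# Assumption: per-frame features come from a vision foundation model
# (e.g. DINOv2); here we fake them with synthetic vectors.
import numpy as np

def pool_video(frame_embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool per-frame embeddings of shape (T, D) into one
    L2-normalized video descriptor of shape (D,)."""
    v = frame_embeddings.mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-12)

def same_operator_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity between two pooled video descriptors; a higher
    score suggests the two avatar videos share the same operator."""
    return float(pool_video(emb_a) @ pool_video(emb_b))

# Demo: two videos sharing an "operator" component vs. a third video
# from a different operator (384-dim, matching DINOv2-S feature size).
rng = np.random.default_rng(0)
identity = rng.normal(size=384)                 # shared operator signal
vid_a = identity + 0.1 * rng.normal(size=(30, 384))
vid_b = identity + 0.1 * rng.normal(size=(30, 384))
vid_c = rng.normal(size=(30, 384))              # unrelated operator

assert same_operator_score(vid_a, vid_b) > same_operator_score(vid_a, vid_c)
```

In a real pipeline, a learned threshold on this score would separate self-reenactments (same operator) from cross-reenactments (impersonation); the paper's results suggest such thresholds transfer poorly across generators, which is exactly what the benchmark's generator-shift protocol measures.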

Abstract

Recent advances in photorealistic avatar generation have enabled highly realistic talking-head avatars, raising security concerns about identity impersonation in AI-mediated communication. To address this challenge, the task of avatar fingerprinting aims to determine whether two avatar videos are driven by the same human operator. However, existing public databases are scarce and rely solely on older talking-head avatar generators, so they do not represent realistic scenarios for avatar fingerprinting. To close this gap, this article introduces AVAPrintDB, a new publicly available multi-generator talking-head avatar database for avatar fingerprinting. AVAPrintDB is constructed from two audiovisual corpora and three state-of-the-art avatar generators (GAGAvatar, LivePortrait, HunyuanPortrait) representing different synthesis paradigms, and includes both self- and cross-reenactments to simulate legitimate usage and impersonation scenarios. Building on this database, we define a standardized, reproducible benchmark for avatar fingerprinting, evaluating public state-of-the-art avatar fingerprinting systems and exploring novel methods based on foundation models (DINOv2 and CLIP). We also conduct a comprehensive analysis under generator and dataset shift. Our results show that, while identity-related motion cues persist across synthetic avatars, current avatar fingerprinting systems remain highly sensitive to changes in the synthesis pipeline and source domain. AVAPrintDB, the benchmark protocols, and the avatar fingerprinting systems are publicly available to facilitate reproducible research.