Enhancing Self-Supervised Talking Head Forgery Detection via a Training-Free Dual-System Framework

arXiv cs.CV / 5/6/2026

Key Points

Supervised talking head forgery detectors struggle to generalize as generators evolve, so the paper focuses on self-supervised approaches for better cross-generator robustness.
It argues that current score-based self-supervised detectors do not fully exploit their discriminative power, especially on hard cases where anomaly ordering can be unreliable.
The authors propose a Training-Free Dual-System (TFDS) framework that first uses lightweight, threshold-based routing to separate confident vs. uncertain samples.
System-2 then re-examines only the uncertain subset with evidence-guided, fine-grained reasoning to correct the relative ordering of ambiguous cases, yielding consistent improvements across datasets and perturbation settings.
The improvements primarily come from better anomaly ordering within the uncertain subset, suggesting existing detectors already contain useful cues that can be unlocked without additional training.
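The routing step described in the points above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `route_by_threshold`, the two-threshold band (`low`, `high`), and the convention that a higher score means "more anomalous" are all assumptions made for the example, since the summary only says the routing is lightweight and threshold-based.

```python
from typing import List, Sequence, Tuple


def route_by_threshold(
    scores: Sequence[float], low: float, high: float
) -> Tuple[List[int], List[int]]:
    """Split sample indices into (confident, uncertain).

    Hypothetical rule: a score inside the closed band [low, high] is treated
    as uncertain; anything outside the band is treated as confident.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    confident: List[int] = []
    uncertain: List[int] = []
    for index, score in enumerate(scores):
        if low <= score <= high:
            uncertain.append(index)
        else:
            confident.append(index)
    return confident, uncertain


# Illustrative anomaly-like scores (made-up values, not from the paper).
example_scores = [0.05, 0.45, 0.55, 0.95]
confident_idx, uncertain_idx = route_by_threshold(example_scores, low=0.3, high=0.7)
print(confident_idx)  # [0, 3]
print(uncertain_idx)  # [1, 2]
```

Only the indices in `uncertain_idx` would be passed on to the more expensive second stage, which is what keeps the first stage cheap.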

Abstract

Supervised talking head forgery detection faces severe generalization challenges due to the continuous evolution of generators. By reducing reliance on generator-specific forgery patterns, self-supervised detectors offer stronger cross-generator robustness. However, existing research has mainly focused on building stronger detectors, while the discriminative capacity of trained detectors remains insufficiently exploited. In particular, for score-based self-supervised detectors, the limited discriminative ability on hard cases is often reflected in unreliable anomaly ordering, leaving room for further refinement. Motivated by this observation, we draw inspiration from the dual-system theory of human cognition and propose a Training-Free Dual-System (TFDS) framework to further exploit the latent discriminative capacity of existing score-based self-supervised detectors. TFDS treats anomaly-like scores as the basis of System-1, using lightweight threshold-based routing to partition samples into confident and uncertain subsets. System-2 then revisits only the uncertain subset, performing fine-grained evidence-guided reasoning to refine the relative ordering of ambiguous samples within the original score distribution. Extensive experiments demonstrate consistent improvements across datasets and perturbation settings, with the gains arising mainly from corrected ordering within the uncertain subset. These findings show that existing self-supervised talking head forgery detectors still contain underexploited discriminative cues that can be effectively unlocked through training-free dual-system reasoning.
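One way to read "refine the relative ordering of ambiguous samples within the original score distribution" is sketched below. It is a toy under stated assumptions, not the paper's method: the abstract does not say how System-2 produces its judgment, so the `evidence` values here are placeholder numbers standing in for whatever the fine-grained reasoning would output. The sketch permutes the original score values among the uncertain samples so that their ranking follows the evidence, leaves confident samples untouched, and uses a small pairwise AUC helper to show how a corrected ordering changes a ranking metric.

```python
from typing import Dict, List, Sequence


def refine_uncertain_order(
    scores: Sequence[float], uncertain_idx: Sequence[int], evidence: Dict[int, float]
) -> List[float]:
    """Reorder only the uncertain samples, reusing their original score values.

    `evidence` maps sample index -> hypothetical System-2 score
    (higher = more likely forged). Confident samples keep their scores.
    """
    refined = list(scores)
    band_values = sorted(scores[i] for i in uncertain_idx)
    by_evidence = sorted(uncertain_idx, key=lambda i: evidence[i])
    # The sample with the weakest evidence receives the lowest band value, etc.
    for value, index in zip(band_values, by_evidence):
        refined[index] = value
    return refined


def pairwise_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (fake, real) pairs ranked correctly; ties count 0.5."""
    fake = [s for s, y in zip(scores, labels) if y == 1]
    real = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((f > r) + 0.5 * (f == r) for f in fake for r in real)
    return wins / (len(fake) * len(real))


# Made-up data: label 1 = forged, 0 = real; samples 1 and 2 are mis-ordered.
scores = [0.05, 0.45, 0.55, 0.95]
labels = [0, 1, 0, 1]
refined = refine_uncertain_order(scores, uncertain_idx=[1, 2], evidence={1: 0.8, 2: 0.2})

print(refined)                       # [0.05, 0.55, 0.45, 0.95]
print(pairwise_auc(scores, labels))  # 0.75
print(pairwise_auc(refined, labels)) # 1.0
```

Because the refined scores are a permutation of the original values inside the uncertain band, the overall score distribution is unchanged; only the ranking of ambiguous samples moves, which matches the abstract's claim that the gains come from corrected ordering within that subset.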
