Synthetic Trust Attacks: Modeling How Generative AI Manipulates Human Decisions in Social Engineering Fraud

arXiv cs.AI / 4/8/2026

Key Points

  • The paper argues that the central threat of generative AI-driven scams is not the synthetic media itself but the manipulation of the victim’s decision-making through “synthetic trust.”
  • It introduces Synthetic Trust Attacks (STAs) as a formal threat category and proposes STAM, an eight-stage operational model covering the attacker’s full chain from reconnaissance to post-compliance leverage.
  • Citing reported performance gaps (e.g., human deepfake detection accuracy of roughly 55.5% and markedly higher compliance rates for LLM scam agents than for human operators), the authors contend that the perception layer is already failing in many real-world scenarios.
  • The research provides a Trust-Cue Taxonomy, a reproducible incident coding schema, and four falsifiable hypotheses connecting attack structure to compliance outcomes (an illustrative coding-record sketch follows this list).
  • As a decision-layer defense, it operationalizes the author’s practitioner-developed Calm, Check, Confirm protocol, reframing defense toward improving human and organizational decision processes rather than only detecting fakes.
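
The paper’s 17 coding fields are not reproduced in this summary, so the sketch below is only an illustration of what a machine-readable incident record in that spirit might look like; every field name and value here is a placeholder chosen for this example, not the paper’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative only (not the paper's 17 fields): one coded incident record
# capturing how a synthetic trust attack was structured and how it ended.
@dataclass
class IncidentRecord:
    incident_id: str
    channel: str                  # e.g. "video call", "voice", "email"
    impersonated_role: str        # e.g. "CFO", "IT support"
    synthetic_media_used: bool
    trust_cues: List[str] = field(default_factory=list)  # cues drawn from a taxonomy
    urgency_framing: bool = False
    verification_attempted: bool = False
    complied: bool = False
    reported_loss_usd: float = 0.0

# Example: a record loosely modeled on the Hong Kong case described in the abstract.
hk_2024 = IncidentRecord(
    incident_id="HK-2024-01",
    channel="video call",
    impersonated_role="CFO",
    synthetic_media_used=True,
    trust_cues=["authority", "social proof", "confidentiality"],
    urgency_framing=True,
    verification_attempted=False,
    complied=True,
    reported_loss_usd=25_000_000,
)
```

A uniform record like this is what makes the paper’s hypotheses testable: coded attack structure on one side, compliance outcome on the other.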

Abstract

Imagine receiving a video call from your CFO, surrounded by colleagues, asking you to urgently authorise a confidential transfer. You comply. Every person on that call was fake, and you just lost $25 million. This is not a hypothetical. It happened in Hong Kong in January 2024, and it is becoming the template for a new generation of fraud. AI has not invented a new crime. It has industrialised an ancient one: the manufacture of trust. This paper proposes Synthetic Trust Attacks (STAs) as a formal threat category and introduces STAM, the Synthetic Trust Attack Model, an eight-stage operational framework covering the full attack chain from adversary reconnaissance through post-compliance leverage. The core argument is this: existing defenses target synthetic media detection, but the real attack surface is the victim's decision. When human deepfake detection accuracy sits at approximately 55.5%, barely above chance, and LLM scam agents achieve 46% compliance versus 18% for human operators while evading safety filters entirely, the perception layer has already failed. Defense must move to the decision layer. We present a five-category Trust-Cue Taxonomy, a reproducible 17-field Incident Coding Schema with a pilot-coded example, and four falsifiable hypotheses linking attack structure to compliance outcomes. The paper further operationalizes the author's practitioner-developed Calm, Check, Confirm protocol as a research-grade decision-layer defense. Synthetic credibility, not synthetic media, is the true attack surface of the AI fraud era.
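
To make the decision-layer framing concrete, here is a minimal sketch of a request gate in the spirit of the Calm, Check, Confirm protocol. The cue list, the deviation check, and the out-of-band confirmation step are assumptions made for this example; they are not the paper’s specification of the protocol.

```python
# Hypothetical decision-layer gate: slow down, verify the process, confirm out of band.
HIGH_RISK_CUES = {"urgency", "secrecy", "authority", "unusual_payment_path"}

def calm(request: dict) -> bool:
    """Calm: flag any request carrying high-risk trust cues for a mandatory pause."""
    return bool(HIGH_RISK_CUES & set(request.get("cues", [])))

def check(request: dict) -> bool:
    """Check: does the request deviate from an established process or known payee?"""
    return request.get("deviates_from_process", False)

def confirm(out_of_band_verified: bool) -> bool:
    """Confirm: approve only after verification on a separately known channel."""
    return out_of_band_verified

def decide(request: dict, out_of_band_verified: bool) -> str:
    # The presented identity (voice, face, video) is never treated as evidence;
    # suspicious requests are held until independently confirmed.
    if calm(request) or check(request):
        return "approve" if confirm(out_of_band_verified) else "hold"
    return "approve"

# Example: the deepfake CFO call would be held pending a callback to a known number.
print(decide({"cues": ["urgency", "secrecy"], "deviates_from_process": True},
             out_of_band_verified=False))  # -> "hold"
```

The point of such a gate is the paper’s thesis in miniature: the defense acts on the decision, not on whether the media looks fake.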