V.O.I.C.E (Voice, Ownership, Identity, Control, Expression): Risk Taxonomy of Synthetic Voice Generation From Empirical Data
arXiv cs.AI / 4/29/2026
Key Points
- The paper argues that synthetic voice generation creates new privacy, security, and governance risks that existing threat models do not adequately capture.
- It introduces V.O.I.C.E, a risk taxonomy derived from multi-source threat modeling using 569 incidents from major AI incident databases plus FTC/IC3 data.
- The taxonomy is further grounded in 1,067 direct reports from U.S. participants (including voice actors, internet personalities, political staff, and the general public) and 2,221 Reddit discussions.
- V.O.I.C.E models not only which risks occur, but also how they emerge and how contextual factors (such as exposure level, social visibility, and availability of legal protections) shape them.
- The work aims to improve governance and defenses by providing an empirically grounded framework for synthetic voice misuse scenarios.


