AU Codes, Language, and Synthesis: Translating Anatomy to Text for Facial Behavior Synthesis
arXiv cs.CV / 3/20/2026
Key Points
- The paper identifies limitations of current AU-based text-to-face methods that encode AUs as one-hot vectors, an encoding that struggles to represent conflicting AUs and can yield anatomically implausible artifacts.
- It proposes describing facial action units with natural language to preserve expressive richness and allow explicit modeling of complex and conflicting expressions.
- The authors introduce BP4D-AUText, a large-scale text-image paired dataset created by applying a Dynamic AU Text Processor to the BP4D and BP4D+ datasets.
- They also present VQ-AUFace, a generative model that leverages facial structural priors to synthesize realistic and diverse facial behaviors from text, achieving superior performance in plausibility and perceptual realism, especially under conflicting AUs.
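The contrast the key points draw between one-hot AU vectors and natural-language AU descriptions can be sketched minimally as follows. The AU names come from the standard FACS taxonomy; the text-template logic is an illustrative assumption, not the paper's actual Dynamic AU Text Processor.

```python
# Hypothetical sketch of the two AU-conditioning schemes. AU numbers and
# names follow FACS; the templating below is an assumption for illustration.

AU_NAMES = {
    1: "inner brow raiser",
    4: "brow lowerer",
    12: "lip corner puller",
    15: "lip corner depressor",
}

def one_hot(active_aus, all_aus=tuple(sorted(AU_NAMES))):
    """Multi-hot encoding: 1 where an AU is active, else 0."""
    return [1 if au in active_aus else 0 for au in all_aus]

def to_text(active_aus):
    """Render active AUs as a natural-language description instead."""
    phrases = [AU_NAMES[au] for au in sorted(active_aus)]
    return "A face showing " + " and ".join(phrases) + "."

# AU12 (smile) together with AU15 (frown) is a conflicting combination;
# a flat multi-hot vector gives no signal that the pair is in tension,
# whereas the text form makes the conflict explicit to a language encoder.
conflict = {12, 15}
print(one_hot(conflict))   # [0, 0, 1, 1]
print(to_text(conflict))
```

The one-hot form treats the conflicting pair like any other combination, which is the failure mode the paper attributes to prior methods; the text form carries the semantics needed to model such cases explicitly.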