AU Codes, Language, and Synthesis: Translating Anatomy to Text for Facial Behavior Synthesis
arXiv cs.CV / 3/20/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper identifies limitations of current AU-based text-to-face methods that encode AUs as one-hot vectors, which struggle with conflicting AUs and can produce anatomically implausible artifacts.
- It proposes describing facial action units with natural language to preserve expressive richness and allow explicit modeling of complex and conflicting expressions.
- The authors introduce BP4D-AUText, a large-scale text-image paired dataset created by applying a Dynamic AU Text Processor to the BP4D and BP4D+ datasets.
- They also present VQ-AUFace, a generative model that leverages facial structural priors to synthesize realistic and diverse facial behaviors from text; the authors report that it outperforms prior AU-conditioned methods in anatomical plausibility and perceptual realism, particularly under conflicting AUs.
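To make the first two key points concrete, the sketch below contrasts the two AU representations the paper discusses. It is an illustration only, not the paper's actual Dynamic AU Text Processor: the `AU_NAMES` mapping uses standard FACS names, but the `one_hot` and `describe` helpers are hypothetical.

```python
# Illustrative sketch (NOT the paper's Dynamic AU Text Processor):
# contrast a one-hot/multi-hot AU encoding with a text description.

# Standard FACS names for a few action units.
AU_NAMES = {
    1: "inner brow raiser",
    2: "outer brow raiser",
    4: "brow lowerer",
    6: "cheek raiser",
    12: "lip corner puller",
    15: "lip corner depressor",
}
AU_INDEX = {au: i for i, au in enumerate(sorted(AU_NAMES))}

def one_hot(active_aus):
    """Multi-hot vector over the known AUs; conflicts are just two set bits."""
    vec = [0] * len(AU_INDEX)
    for au in active_aus:
        vec[AU_INDEX[au]] = 1
    return vec

def describe(active_aus):
    """Natural-language description naming each active AU explicitly."""
    parts = [f"AU{au} ({AU_NAMES[au]})" for au in sorted(active_aus)]
    return "A face showing " + ", ".join(parts) + "."

# AU12 and AU15 pull the lip corners in opposite directions. The vector
# encoding hides this conflict; the text makes both muscle actions explicit.
print(one_hot([12, 15]))   # two bits set, conflict invisible
print(describe([12, 15]))  # conflict stated in words
```

This is the gist of the paper's argument for text conditioning: a generator given only the vector must guess how two opposing AUs interact, while the textual form lets a language-conditioned model reason about the conflict directly.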