FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing
arXiv cs.CV / 4/1/2026
Key Points
- FED-Bench is proposed as a new facial expression image-editing benchmark designed to support fine-grained control while preserving identity and background, addressing limitations in prior benchmarks.
- The benchmark includes 747 original–instruction–ground-truth triplets built via a cascaded, scalable pipeline, enabling more rigorous and instruction-accurate evaluation.
- A new evaluation protocol called FED-Score separates scoring into three dimensions: Alignment (instruction following), Fidelity (image quality and identity preservation), and Relative Expression Gain (expression change magnitude). The separation is intended to reduce systemic metric biases.
- Experiments across 18 editing models show they typically cannot achieve high fidelity and accurate expression manipulation simultaneously, with fine-grained instruction following identified as the main bottleneck.
- The authors also plan to release code and provide a 20k+ in-the-wild training set for facial expression editing; they report that fine-tuning a baseline model on it yields significant gains.
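
The triplet structure and the three-way score split described above can be pictured with a minimal sketch. This is not the authors' code: the class names `Triplet` and `DimensionScores`, the field names, and the `summarize` helper are all hypothetical, and the summary does not say how each dimension is actually computed, so the sketch only shows the idea of reporting the three dimensions side by side instead of collapsing them into a single number.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Triplet:
    """One benchmark item (hypothetical layout): original, instruction, ground truth."""
    original_path: str
    instruction: str
    ground_truth_path: str


@dataclass(frozen=True)
class DimensionScores:
    """Per-sample scores kept separate; how each is computed is not specified here."""
    alignment: float                 # instruction following
    fidelity: float                  # image quality and identity preservation
    relative_expression_gain: float  # magnitude of the expression change


def summarize(scores: list[DimensionScores]) -> dict[str, float]:
    """Average each dimension independently across samples (an assumed aggregation)."""
    if not scores:
        raise ValueError("no scores to summarize")
    return {
        "alignment": mean(s.alignment for s in scores),
        "fidelity": mean(s.fidelity for s in scores),
        "relative_expression_gain": mean(s.relative_expression_gain for s in scores),
    }
```

Keeping the dimensions apart makes the trade-off noted in the key points visible: a model with high fidelity but low alignment would show up as such, rather than being hidden behind one blended score.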