MD-Face: MoE-Enhanced Label-Free Disentangled Representation for Interactive Facial Attribute Editing

arXiv cs.CV / 4/23/2026


Key Points

  • The paper introduces MD-Face, a label-free method to learn disentangled facial representations for more reliable GAN-based attribute editing without unintended attribute changes.
  • MD-Face uses a Mixture of Experts (MoE) backbone with a gating mechanism to assign experts dynamically, aiming to learn more independent semantic vectors.
  • To reduce attribute entanglement further, it proposes a geometry-aware loss that aligns each semantic vector with a corresponding Semantic Boundary Vector (SBV) using a Jacobian-based pushforward approach.
  • Experiments on ProGAN and StyleGAN indicate MD-Face outperforms unsupervised baselines and is competitive with supervised disentanglement methods.
  • Compared with diffusion-based editing methods, the approach reports better image quality and lower inference latency, supporting interactive facial editing use cases.
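The MoE-with-gating idea in the key points above can be illustrated with a minimal sketch. The paper does not specify the expert architecture, so the sketch below assumes each expert is a simple linear map and the gate is a learned linear scorer followed by a softmax; `moe_forward`, `expert_weights`, and `gate_weights` are hypothetical names, not the paper's API.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(z, expert_weights, gate_weights):
    """Hypothetical MoE layer: gate scores a latent z, then mixes expert outputs.

    z              : latent vector, shape (d,)
    expert_weights : list of K expert matrices, each shape (d, d)
    gate_weights   : gating matrix, shape (K, d)
    """
    gates = softmax(gate_weights @ z)            # (K,) mixing probabilities
    outputs = np.stack([W @ z for W in expert_weights])  # (K, d) per-expert outputs
    return gates @ outputs                       # convex combination, shape (d,)
```

With a zero gate the mixture is uniform over experts, which makes the behavior easy to check by hand; in the paper's setting the gate would instead be trained so that different experts specialize in different semantic directions.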

Abstract

GAN-based facial attribute editing is widely used in virtual avatars and social media but often suffers from attribute entanglement, where modifying one face attribute unintentionally alters others. While supervised disentangled representation learning can address this, it relies heavily on labeled data, incurring high annotation costs. To address these challenges, we propose MD-Face, a label-free disentangled representation learning framework based on Mixture of Experts (MoE). MD-Face uses a MoE backbone with a gating mechanism that dynamically allocates experts, enabling the model to learn semantic vectors with greater independence. To further reduce attribute entanglement, we introduce a geometry-aware loss, which aligns each semantic vector with its corresponding Semantic Boundary Vector (SBV) through a Jacobian-based pushforward method. Experiments with ProGAN and StyleGAN show that MD-Face outperforms unsupervised baselines and competes with supervised ones. Compared to diffusion-based methods, it offers better image quality and lower inference latency, making it well suited to interactive editing.
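The geometry-aware loss described in the abstract can be sketched as follows. The paper's exact formulation is not given here, so this is a minimal illustration under stated assumptions: the pushforward is approximated by a finite-difference Jacobian-vector product `J_G(z) v ≈ (G(z + εv) − G(z)) / ε`, the SBV is assumed to live in the same space as the pushforward output, and alignment is scored with cosine similarity. The names `pushforward` and `alignment_loss` are illustrative, not the paper's.

```python
import numpy as np

def pushforward(G, z, v, eps=1e-3):
    # Finite-difference Jacobian-vector product: J_G(z) v ~ (G(z + eps*v) - G(z)) / eps
    return (G(z + eps * v) - G(z)) / eps

def alignment_loss(G, z, v, sbv):
    """Penalize misalignment between the pushed-forward semantic vector and its SBV.

    G   : mapping (e.g., a generator stage), callable on a latent vector
    z   : base latent vector
    v   : semantic vector to be disentangled
    sbv : Semantic Boundary Vector (assumed same space as pushforward output)
    """
    jv = pushforward(G, z, v)
    cos = jv @ sbv / (np.linalg.norm(jv) * np.linalg.norm(sbv) + 1e-8)
    return 1.0 - cos  # zero when perfectly aligned
```

For a linear `G` the finite-difference pushforward is exact, so a semantic vector aligned with its SBV yields a loss of approximately zero; in training, minimizing this term would pull each learned semantic direction toward its attribute boundary.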