PartNerFace: Part-based Neural Radiance Fields for Animatable Facial Avatar Reconstruction

arXiv cs.CV · April 16, 2026


Key Points

  • The paper introduces PartNerFace, a part-based Neural Radiance Fields method to reconstruct animatable facial avatars from monocular RGB video inputs.
  • It argues that prior approaches either rely on morphable-model conditioning or learn a generic canonical field, leading to poor generalization to unseen facial expressions and limited fine motion capture.
  • PartNerFace improves reconstruction by using inverse skinning with a parametric head model to map observed points into canonical space, then applying fine-scale, part-specific deformation modeling.
  • The method uses multiple local MLPs with soft-weighting to adaptively partition the canonical space and aggregate part-wise deformation predictions for each 3D point.
  • Experiments report stronger quantitative and qualitative performance than state-of-the-art techniques, particularly for unseen expressions and detailed facial motion.
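The inverse-skinning step summarized above (mapping an observed point into canonical space with a parametric head model) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bone transforms and per-point skinning weights are random stand-ins, whereas a real system would query them from a FLAME-style head model fitted to the video.

```python
import numpy as np

# Hedged sketch of inverse skinning: map an observed-space point back to
# canonical space by inverting the blended bone transform of a parametric
# head model. Transforms and skinning weights are random placeholders.

rng = np.random.default_rng(1)
J = 5  # number of joints/bones (assumed for illustration)

def random_rigid_transform():
    """Random 4x4 rigid transform (rotation + small translation)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1  # ensure a proper rotation
    t = np.eye(4)
    t[:3, :3], t[:3, 3] = q, rng.normal(scale=0.05, size=3)
    return t

transforms = np.stack([random_rigid_transform() for _ in range(J)])
weights = rng.dirichlet(np.ones(J))  # per-point skinning weights (sum to 1)

def inverse_skin(x_obs):
    """x_obs = (sum_j w_j T_j) x_can  =>  x_can = (sum_j w_j T_j)^-1 x_obs."""
    blended = np.tensordot(weights, transforms, axes=1)  # (4, 4) blended transform
    x_h = np.append(x_obs, 1.0)                          # homogeneous coordinates
    return (np.linalg.inv(blended) @ x_h)[:3]

x_can = inverse_skin(np.array([0.2, 0.1, -0.3]))
```

Applying the blended transform to `x_can` recovers the observed point, which is the property the canonical-space mapping relies on.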

Abstract

We present PartNerFace, a part-based neural radiance fields approach for reconstructing animatable facial avatars from monocular RGB videos. Existing solutions either simply condition the implicit network on morphable model parameters or learn an imaginary canonical radiance field, causing them to fail to generalize to unseen facial expressions and to miss fine-scale motion details. To address these challenges, we first apply inverse skinning based on a parametric head model to map an observed point to the canonical space, and then model fine-scale motions with a part-based deformation field. Our key insight is that the deformation of different facial parts should be modeled differently. Specifically, our part-based deformation field consists of multiple local MLPs that adaptively partition the canonical space into different parts, where the deformation of a 3D point is computed by aggregating the predictions of all local MLPs through a soft-weighting mechanism. Extensive experiments demonstrate that our method generalizes well to unseen expressions and is capable of modeling fine-scale facial motions, outperforming state-of-the-art methods both quantitatively and qualitatively.
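The soft-weighting aggregation described in the abstract can be sketched as below. This is a hedged illustration under assumed details: the Gaussian-distance weighting, the per-part anchor points `centers`, and the tiny two-layer MLPs are stand-ins for whatever parameterization the paper actually learns.

```python
import numpy as np

# Sketch of a part-based deformation field: K local MLPs each predict a
# deformation offset for a canonical-space point, and the predictions are
# blended with softmax weights (a soft partition of canonical space).
# Weighting scheme and anchors are illustrative assumptions.

rng = np.random.default_rng(0)
K, HIDDEN = 4, 16  # number of parts and hidden width (assumed values)

def make_local_mlp():
    """One tiny 2-layer MLP: R^3 -> R^3 deformation offset."""
    w1 = rng.normal(scale=0.1, size=(3, HIDDEN))
    w2 = rng.normal(scale=0.1, size=(HIDDEN, 3))
    return lambda x: np.tanh(x @ w1) @ w2

local_mlps = [make_local_mlp() for _ in range(K)]
centers = rng.normal(size=(K, 3))  # assumed per-part anchor points

def part_based_deformation(x, tau=1.0):
    """Aggregate the K local predictions with distance-based softmax weights."""
    d2 = np.sum((centers - x) ** 2, axis=1)           # squared distance to anchors
    logits = -d2 / tau
    w = np.exp(logits - logits.max())
    w /= w.sum()                                      # weights sum to 1
    preds = np.stack([mlp(x) for mlp in local_mlps])  # (K, 3) per-part offsets
    return w @ preds                                  # (3,) blended offset

offset = part_based_deformation(np.array([0.1, -0.2, 0.05]))
```

Because the weights are a smooth function of position, nearby points receive similar part assignments, so the soft partition avoids seams between parts while still letting each local MLP specialize.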