MeshLAM: Feed-Forward One-Shot Animatable Textured Mesh Avatar Reconstruction

arXiv cs.CV · April 28, 2026


Key Points

  • MeshLAM is a feed-forward framework that reconstructs a high-fidelity, animatable 3D textured head avatar from a single image in one forward pass.
  • The method avoids prior approaches’ heavy test-time optimization and multi-view requirements by using a dual shape/texture map architecture driven by a shared transformer backbone.
  • MeshLAM introduces an iterative GRU-based decoder with progressive geometry deformation and texture refinement to prevent mesh collapse and maintain topological integrity during deformation.
  • It also uses a reprojection-based texture guidance mechanism to anchor appearance learning to the input image, improving coherence of the reconstructed textures.
  • Experiments indicate that MeshLAM outperforms existing state-of-the-art methods in reconstruction quality, animation capability, and computational efficiency.
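The iterative GRU-based decoder can be illustrated with a minimal sketch: a GRU cell repeatedly predicts small, bounded per-vertex offsets conditioned on image features and the current geometry, so the mesh deforms progressively rather than in one large jump. This is an assumption-laden illustration, not the authors' implementation; all names (`GRUCell`, `iterative_deform`, `step_scale`) and the bounded-offset design are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell operating on per-vertex feature vectors."""
    def __init__(self, in_dim, hid_dim):
        s = 0.1
        self.Wz = rng.normal(0, s, (in_dim + hid_dim, hid_dim))
        self.Wr = rng.normal(0, s, (in_dim + hid_dim, hid_dim))
        self.Wh = rng.normal(0, s, (in_dim + hid_dim, hid_dim))

    def __call__(self, x, h):
        xh = np.concatenate([x, h], axis=-1)
        z = sigmoid(xh @ self.Wz)           # update gate
        r = sigmoid(xh @ self.Wr)           # reset gate
        xh_r = np.concatenate([x, r * h], axis=-1)
        h_tilde = np.tanh(xh_r @ self.Wh)   # candidate state
        return (1 - z) * h + z * h_tilde

def iterative_deform(verts, feats, steps=4, step_scale=0.01):
    """Progressively deform template vertices with small GRU-predicted offsets.

    Bounding each per-step offset (tanh * step_scale) is one way to keep the
    deformation gradual so the mesh cannot collapse in a single update.
    """
    n, fdim = feats.shape
    hid = 16
    cell = GRUCell(fdim + 3, hid)
    W_out = rng.normal(0, 0.1, (hid, 3))    # hidden state -> xyz offset
    h = np.zeros((n, hid))
    v = verts.copy()
    for _ in range(steps):
        x = np.concatenate([feats, v], axis=-1)   # condition on current geometry
        h = cell(x, h)
        v = v + step_scale * np.tanh(h @ W_out)   # small, bounded update
    return v
```

Because each of the `steps` updates is bounded by `step_scale`, the total displacement of any vertex is at most `steps * step_scale`, which is one simple way to enforce the progressive deformation the paper describes.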

Abstract

We introduce MeshLAM, a feed-forward framework for one-shot animatable mesh head reconstruction that generates high-fidelity, animatable 3D head avatars from a single image. Unlike previous work that relies on time-consuming test-time optimization or extensive multi-view data, our method produces complete mesh representations with inherent animatability from a single image in a single forward pass. Our approach employs a dual shape and texture map architecture that simultaneously processes mesh vertices and the texture map using image features extracted from a shared transformer backbone, allowing for coherent shape carving and appearance modeling. To prevent mesh collapse and ensure topological integrity during feed-forward deformation, we propose an iterative GRU-based decoding mechanism with progressive geometry deformation and texture refinement, coupled with a novel reprojection-based texture guidance mechanism that anchors appearance learning to the input image. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches in reconstruction quality, animation capability, and computational efficiency. Project page at https://meshlam.github.io.
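The reprojection-based texture guidance described above can be sketched as follows: project the mesh vertices into the input image with a pinhole camera, bilinearly sample the image colors at the projected locations, and penalize predicted per-vertex colors that disagree with them. This is a simplified sketch under assumed conventions (camera-space vertices with z > 0, intrinsics `f`, `cx`, `cy`, an L1 penalty); the actual loss and sampling in the paper may differ.

```python
import numpy as np

def project(verts, f, cx, cy):
    """Pinhole projection of camera-space vertices (z > 0) to pixel coords."""
    x, y, z = verts[:, 0], verts[:, 1], verts[:, 2]
    return np.stack([f * x / z + cx, f * y / z + cy], axis=-1)

def sample_bilinear(img, uv):
    """Bilinearly sample an HxWx3 image at continuous pixel locations (u, v)."""
    h, w, _ = img.shape
    u = np.clip(uv[:, 0], 0, w - 1.001)   # keep u0+1, v0+1 in bounds
    v = np.clip(uv[:, 1], 0, h - 1.001)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    du, dv = (u - u0)[:, None], (v - v0)[:, None]
    c00 = img[v0, u0];     c01 = img[v0, u0 + 1]
    c10 = img[v0 + 1, u0]; c11 = img[v0 + 1, u0 + 1]
    top = c00 * (1 - du) + c01 * du
    bot = c10 * (1 - du) + c11 * du
    return top * (1 - dv) + bot * dv

def reprojection_guidance_loss(pred_colors, verts_cam, img, f, cx, cy):
    """L1 between predicted per-vertex colors and colors reprojected
    from the input image -- anchoring appearance to the observed photo."""
    target = sample_bilinear(img, project(verts_cam, f, cx, cy))
    return np.abs(pred_colors - target).mean()
```

Sampling the input image at each vertex's reprojected location gives a per-vertex color target "for free", which is the anchoring effect the abstract attributes to this guidance term.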
