GenLCA: 3D Diffusion for Full-Body Avatars from In-the-Wild Videos
arXiv cs.CV / 4/9/2026
Key Points
- GenLCA is a diffusion-based generative model that creates and edits photorealistic full-body 3D avatars from text and image inputs while keeping facial and full-body animations high-fidelity.
- The method trains a full-body 3D diffusion model from partially observable 2D video data by using a repurposed pretrained avatar reconstruction model as an animatable 3D tokenizer, scaling training to millions of real-world videos.
- Because real-world videos often contain only partial body observations, GenLCA introduces a visibility-aware diffusion training strategy that replaces invalid token regions with learnable tokens and applies losses only to valid regions to prevent blur/transparency artifacts.
- A flow-based diffusion model is trained on the resulting 3D token dataset, aiming to preserve the photorealism and animatability properties of the underlying reconstruction model while enabling native 3D learning.
- The authors report that GenLCA produces diverse, high-fidelity avatar generation and editing results and claim large performance improvements over existing approaches.
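The visibility-aware training strategy above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: it assumes the avatar is represented as a sequence of 3D tokens with a per-token validity mask, replaces invalid positions with one shared learnable token, and averages a flow-matching-style loss only over valid positions so unobserved regions contribute no gradient. All names, shapes, and the linear-path velocity target are assumptions.

```python
import torch
import torch.nn as nn


class VisibilityAwareMasking(nn.Module):
    """Replace tokens at unobserved positions with a shared learnable token.

    Hypothetical sketch of the strategy described in the key points; the
    paper's real token layout and module names may differ.
    """

    def __init__(self, token_dim: int):
        super().__init__()
        # A single learnable embedding stands in for all invalid regions.
        self.invalid_token = nn.Parameter(torch.zeros(token_dim))

    def forward(self, tokens: torch.Tensor, valid: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, token_dim); valid: (batch, num_tokens) bool
        mask = valid.unsqueeze(-1)  # (B, N, 1), broadcasts over token_dim
        return torch.where(mask, tokens, self.invalid_token)


def masked_flow_matching_loss(pred_velocity: torch.Tensor,
                              x0: torch.Tensor,
                              x1: torch.Tensor,
                              valid: torch.Tensor) -> torch.Tensor:
    """Flow-matching loss restricted to valid token positions.

    Assumes a linear probability path, so the regression target is x1 - x0.
    Averaging only over valid positions keeps unobserved (padded) regions
    from pulling predictions toward blur/transparency.
    """
    target = x1 - x0                                   # (B, N, D)
    err = (pred_velocity - target).pow(2).mean(dim=-1)  # (B, N) per-token error
    w = valid.float()
    return (err * w).sum() / w.sum().clamp(min=1.0)
```

A training step would first pass the reconstructed tokens through `VisibilityAwareMasking`, then score the denoiser's predicted velocity with `masked_flow_matching_loss`; the design choice is that invalid regions still enter the network (as the learnable token) but never enter the loss.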