Hitem3D 2.0: Multi-View Guided Native 3D Texture Generation

arXiv cs.CV / 4/13/2026


Key Points

  • Hitem3D 2.0 is introduced as a multi-view guided framework for native 3D texture generation aimed at fixing incomplete coverage, cross-view inconsistency, and geometry-texture misalignment.
  • The approach first runs a multi-view synthesis stage, built on a pre-trained image editing backbone with plug-and-play modules for geometric alignment, cross-view consistency, and illumination uniformity.
  • It then uses a native 3D texture generation model that projects the synthesized multi-view textures onto 3D surfaces and plausibly fills unseen regions.
  • By integrating multi-view consistency constraints with native 3D texture modeling, the method improves texture completeness, coherence across views, and alignment with geometry.
  • Experiments report that Hitem3D 2.0 outperforms prior methods across texture detail, fidelity, consistency, coherence, and geometric alignment metrics.

Abstract

Although recent advances have improved the quality of 3D texture generation, existing methods still struggle with incomplete texture coverage, cross-view inconsistency, and misalignment between geometry and texture. To address these limitations, we propose Hitem3D 2.0, a multi-view guided native 3D texture generation framework that enhances texture quality through the integration of 2D multi-view generation priors and native 3D texture representations. Hitem3D 2.0 comprises two key components: a multi-view synthesis framework and a native 3D texture generation model. The multi-view generation is built upon a pre-trained image editing backbone and incorporates plug-and-play modules that explicitly promote geometric alignment, cross-view consistency, and illumination uniformity, thereby enabling the synthesis of high-fidelity multi-view images. Conditioned on the generated views and 3D geometry, the native 3D texture generation model projects multi-view textures onto 3D surfaces while plausibly completing textures in unseen regions. Through the integration of multi-view consistency constraints with native 3D texture modeling, Hitem3D 2.0 significantly improves texture completeness, cross-view coherence, and geometric alignment. Experimental results demonstrate that Hitem3D 2.0 outperforms existing methods in terms of texture detail, fidelity, consistency, coherence, and alignment.
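
To make the idea of projecting multi-view textures onto a surface and completing unseen regions more concrete, here is a minimal sketch. It is not the Hitem3D 2.0 model: the abstract describes a learned native 3D texture generator, whereas this sketch swaps in a classical stand-in (cosine-weighted view blending followed by a nearest-neighbor fill). The function names `blend_views` and `fill_unseen`, the per-point color layout, and the choice to ignore occlusion are all assumptions made for illustration.

```python
import math

def blend_views(normals, view_dirs, view_colors):
    """Blend per-view colors onto surface points (hypothetical helper).

    normals:      unit normal per surface point, as (x, y, z) tuples
    view_dirs:    unit vector per view, pointing from the surface toward the camera
    view_colors:  view_colors[v][i] is the RGB tuple view v observed for point i

    Each view is weighted by max(0, normal . view_dir), so views that face a
    point head-on dominate and back-facing views contribute nothing.
    Occlusion is ignored here (an assumption of this sketch).
    Returns (colors, seen); colors[i] is None where no view faces point i.
    """
    colors, seen = [], []
    for i, n in enumerate(normals):
        total_w = 0.0
        acc = [0.0, 0.0, 0.0]
        for d, cols in zip(view_dirs, view_colors):
            w = max(0.0, n[0] * d[0] + n[1] * d[1] + n[2] * d[2])
            total_w += w
            for c in range(3):
                acc[c] += w * cols[i][c]
        if total_w > 0.0:
            colors.append(tuple(a / total_w for a in acc))
            seen.append(True)
        else:
            colors.append(None)
            seen.append(False)
    return colors, seen

def fill_unseen(points, colors, seen):
    """Copy the color of the nearest seen point into each unseen point.

    A crude placeholder for the learned completion step described in the abstract.
    """
    seen_idx = [i for i, s in enumerate(seen) if s]
    if not seen_idx:
        raise ValueError("no point is visible from any view")
    filled = list(colors)
    for i, s in enumerate(seen):
        if not s:
            nearest = min(seen_idx, key=lambda k: math.dist(points[i], points[k]))
            filled[i] = colors[nearest]
    return filled
```

A nearest-neighbor fill only propagates colors that were already observed, which is why a generative model is needed to complete unseen regions plausibly rather than merely smearing nearby texture.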