Strips as Tokens: Artist Mesh Generation with Native UV Segmentation

arXiv cs.CV / 4/13/2026


Key Points

  • The paper introduces “Strips as Tokens (SATO),” a new token-ordering framework for autoregressive transformers that generates artist-quality 3D meshes.
  • Unlike prior approaches that rely on coordinate sorting or patch heuristics, SATO uses triangle-strip-inspired connected face chains that explicitly encode UV boundaries to preserve edge flow and structural regularity.
  • SATO employs a unified token representation that can be decoded into either triangle or quadrilateral meshes, enabling flexible output formats from the same sequence.
  • The method is designed for joint training: large triangle datasets provide baseline structural priors, while high-quality quad datasets improve geometric regularity.
  • Experimental results reported by the authors indicate SATO outperforms existing methods across geometric quality, structural coherence, and UV segmentation accuracy.
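
For readers unfamiliar with the triangle strips that inspire SATO's ordering, the following is a minimal sketch of classic strip decoding, not the paper's tokenizer. The `RESTART` marker and the `decode_strip` name are hypothetical, introduced here only to illustrate how a chain of connected faces can be broken at a boundary such as a UV seam. How SATO actually encodes boundaries is not specified in this summary.

```python
from typing import List, Optional, Sequence, Tuple

Triangle = Tuple[int, int, int]

# Hypothetical boundary marker (an assumption for illustration only):
# it ends the current chain so the next face starts a fresh strip.
RESTART = None


def decode_strip(tokens: Sequence[Optional[int]]) -> List[Triangle]:
    """Decode a classic triangle strip of vertex indices into triangles.

    After the first two vertices of a chain, every additional vertex
    forms a new triangle with the two vertices before it.
    """
    triangles: List[Triangle] = []
    run: List[int] = []  # vertices seen since the last restart
    for tok in tokens:
        if tok is RESTART:
            run = []
            continue
        run.append(tok)
        if len(run) >= 3:
            a, b, c = run[-3], run[-2], run[-1]
            # Every second triangle in a strip is flipped so that all
            # faces keep the same winding order.
            if (len(run) - 3) % 2 == 1:
                a, b = b, a
            triangles.append((a, b, c))
    return triangles


if __name__ == "__main__":
    # Four indices describe two connected triangles.
    print(decode_strip([0, 1, 2, 3]))
    # A restart splits the sequence into two independent chains.
    print(decode_strip([0, 1, 2, RESTART, 4, 5, 6, 7]))
```

A strip of n connected triangles needs n + 2 vertex indices, whereas listing each triangle independently needs 3n, which is one reason strip-like orderings can shorten sequences.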

Abstract

Recent advancements in autoregressive transformers have demonstrated remarkable potential for generating artist-quality meshes. However, the token ordering strategies employed by existing methods typically fail to meet professional artist standards, where coordinate-based sorting yields inefficiently long sequences, and patch-based heuristics disrupt the continuous edge flow and structural regularity essential for high-quality modeling. To address these limitations, we propose Strips as Tokens (SATO), a novel framework with a token ordering strategy inspired by triangle strips. By constructing the sequence as a connected chain of faces that explicitly encodes UV boundaries, our method naturally preserves the organized edge flow and semantic layout characteristic of artist-created meshes. A key advantage of this formulation is its unified representation, enabling the same token sequence to be decoded into either a triangle or quadrilateral mesh. This flexibility facilitates joint training on both data types: large-scale triangle data provides fundamental structural priors, while high-quality quad data enhances the geometric regularity of the outputs. Extensive experiments demonstrate that SATO consistently outperforms prior methods in terms of geometric quality, structural coherence, and UV segmentation.
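
The idea that one sequence can be read as either a triangle or a quadrilateral mesh can be illustrated with the conventional quad-strip layout, in which each pair of new vertices closes one quad. This is a sketch under that assumption only. The function names `strip_to_quads` and `quads_to_triangles` are hypothetical, and the abstract does not state that SATO decodes its tokens this way.

```python
from typing import List, Sequence, Tuple

Quad = Tuple[int, int, int, int]
Triangle = Tuple[int, int, int]


def strip_to_quads(strip: Sequence[int]) -> List[Quad]:
    """Read a vertex strip as quads: vertices arrive in pairs, and each
    new pair forms a quad with the previous pair."""
    quads: List[Quad] = []
    for i in range(0, len(strip) - 3, 2):
        v0, v1, v2, v3 = strip[i:i + 4]
        # Reorder so the four corners are listed around the perimeter.
        quads.append((v0, v1, v3, v2))
    return quads


def quads_to_triangles(quads: Sequence[Quad]) -> List[Triangle]:
    """Split each quad (a, b, c, d) along the b-d diagonal, keeping the
    winding order of the quad."""
    triangles: List[Triangle] = []
    for a, b, c, d in quads:
        triangles.append((a, b, d))
        triangles.append((b, c, d))
    return triangles


if __name__ == "__main__":
    strip = [0, 1, 2, 3, 4, 5]
    quads = strip_to_quads(strip)
    print(quads)                      # quad reading of the sequence
    print(quads_to_triangles(quads))  # triangle reading of the same sequence
```

Under this convention the same six indices yield two quads or four triangles, so a single ordering could in principle serve both triangle and quad training data.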