
SwiftTailor: Efficient 3D Garment Generation with Geometry Image Representation

arXiv cs.CV / 3/20/2026

📰 News | Tools & Practical Usage | Models & Research

Key Points

  • SwiftTailor presents a two-stage framework that unifies sewing-pattern reasoning with geometry-based mesh synthesis through a compact Garment Geometry Image representation.
  • It introduces PatternMaker to predict sewing patterns from diverse inputs and GarmentSewer to convert these patterns into a Garment Geometry Image encoding the 3D garment surface in a unified UV space.
  • The final 3D mesh is reconstructed via an efficient inverse mapping that leverages remeshing and dynamic stitching to amortize the cost of physical simulation.
  • Evaluations on the Multimodal GarmentCodeData dataset show state-of-the-art accuracy and visual fidelity while significantly reducing inference time compared with prior methods, whose inference takes 30 seconds to a minute.
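
To make the geometry image idea concrete: a geometry image stores a 3D surface position at each pixel of a UV grid, so a mesh can be recovered by connecting neighboring valid pixels into triangles. The sketch below illustrates that generic inverse mapping with NumPy. It is not SwiftTailor's implementation; the function name `geometry_image_to_mesh`, the boolean `mask` input, and the quad-splitting rule are assumptions made for illustration, and the paper's remeshing step is not reproduced.

```python
import numpy as np

def geometry_image_to_mesh(geo_img, mask):
    """Convert an (H, W, 3) geometry image into vertices and triangle faces.

    geo_img[i, j] holds the 3D position of the surface point at UV pixel (i, j).
    mask[i, j] is True where the pixel belongs to a garment panel (assumed input).
    """
    h, w, _ = geo_img.shape
    # Give every valid pixel a vertex index in row-major order; -1 marks empty pixels.
    index = -np.ones((h, w), dtype=np.int64)
    index[mask] = np.arange(mask.sum())
    vertices = geo_img[mask]

    faces = []
    for i in range(h - 1):
        for j in range(w - 1):
            a, b = index[i, j], index[i, j + 1]
            c, d = index[i + 1, j], index[i + 1, j + 1]
            # Emit the two triangles of a pixel quad only if all four corners are valid.
            if min(a, b, c, d) >= 0:
                faces.append((a, b, c))
                faces.append((b, d, c))
    return vertices, np.array(faces, dtype=np.int64).reshape(-1, 3)
```

Because every pixel maps to at most one vertex, the cost is linear in the image size, which hints at why an image-based representation can be cheaper to decode than running a physical simulation.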

Abstract

Realistic and efficient 3D garment generation remains a longstanding challenge in computer vision and digital fashion. Existing methods typically rely on large vision-language models to produce serialized representations of 2D sewing patterns, which are then transformed into simulation-ready 3D meshes using a garment modeling framework such as GarmentCode. Although these approaches yield high-quality results, they often suffer from slow inference times, ranging from 30 seconds to a minute. In this work, we introduce SwiftTailor, a novel two-stage framework that unifies sewing-pattern reasoning and geometry-based mesh synthesis through a compact geometry image representation. SwiftTailor comprises two lightweight modules: PatternMaker, an efficient vision-language model that predicts sewing patterns from diverse input modalities, and GarmentSewer, an efficient dense prediction transformer that converts these patterns into a novel Garment Geometry Image, encoding the 3D surface of all garment panels in a unified UV space. The final 3D mesh is reconstructed through an efficient inverse mapping process that incorporates remeshing and dynamic stitching algorithms to directly assemble the garment, thereby amortizing the cost of physical simulation. Extensive experiments on the Multimodal GarmentCodeData demonstrate that SwiftTailor achieves state-of-the-art accuracy and visual fidelity while significantly reducing inference time. This work offers a scalable, interpretable, and high-performance solution for next-generation 3D garment generation.
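
The abstract does not spell out how the stitching step assembles panels. As a rough illustration of what joining panels along seams can involve, the sketch below shows a generic technique, vertex welding, in which paired seam vertices are merged into a single vertex at their mean position and the faces are re-indexed. The function name `weld_seams` and the `seam_pairs` input are hypothetical; this is a minimal sketch under those assumptions, not the paper's dynamic stitching algorithm.

```python
import numpy as np

def weld_seams(vertices, faces, seam_pairs):
    """Merge paired seam vertices and re-index faces (generic vertex welding).

    vertices:   (N, 3) float array of positions.
    faces:      (F, 3) int array of vertex indices.
    seam_pairs: iterable of (i, j) index pairs to be sewn together (assumed given).
    """
    n = len(vertices)
    parent = np.arange(n)

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in seam_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    roots = np.array([find(i) for i in range(n)])
    unique_roots, new_index = np.unique(roots, return_inverse=True)

    # Place each welded vertex at the mean position of its group.
    welded = np.zeros((len(unique_roots), 3))
    np.add.at(welded, new_index, vertices)
    counts = np.bincount(new_index, minlength=len(unique_roots))
    welded /= counts[:, None]
    return welded, new_index[faces]
```

The union-find structure handles corners where more than two panels meet, since chained pairs collapse into one group.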