DRG-Font: Dynamic Reference-Guided Few-shot Font Generation via Contrastive Style-Content Disentanglement

arXiv cs.CV / 4/16/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper introduces DRG-Font, a dynamic reference-guided approach to few-shot font generation that targets better preservation of local glyph characteristics from limited references.
  • It improves style capture by contrastively disentangling the style and content embedding spaces, so that style and shape priors are learned as separate components (a minimal contrastive-loss sketch follows this list).
  • A Reference Selection (RS) module dynamically chooses the most suitable style reference from a candidate pool, providing more effective style supervision (see the selection sketch after this list).
  • The architecture uses multi-scale style/content head blocks (MSHB/MCHB) and a multi-fusion upsampling block (MFUB) to fuse the selected style prior with the target content prior for generating the final glyph.
  • The authors report significant performance gains over existing state-of-the-art methods across multiple visual and analytical benchmarks.

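The paper's exact contrastive objective isn't spelled out in this summary, but the idea of disentangling style and content embedding spaces can be illustrated with a minimal InfoNCE-style sketch: style embeddings of glyphs from the same font are pulled together regardless of character, and content embeddings of the same character are pulled together regardless of font. The encoder names, batching scheme, and temperature below are assumptions, not DRG-Font's actual formulation.

```python
# Hypothetical sketch of a contrastive style/content disentanglement loss.
# Names (style_enc, content_enc, temperature) are illustrative, not the
# paper's actual API; DRG-Font's real objective may differ.
import torch
import torch.nn.functional as F

def info_nce(anchors: torch.Tensor, positives: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE over a batch: anchors[i] should match positives[i] and repel positives[j != i]."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature                      # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)    # diagonal entries are positives
    return F.cross_entropy(logits, targets)

def disentanglement_loss(style_enc, content_enc, glyphs, glyphs_same_font, glyphs_same_char):
    """glyphs / glyphs_same_font share a font; glyphs / glyphs_same_char share a character."""
    # Style embeddings of same-font glyphs should agree, independent of character.
    style_loss = info_nce(style_enc(glyphs), style_enc(glyphs_same_font))
    # Content embeddings of same-character glyphs should agree, independent of font.
    content_loss = info_nce(content_enc(glyphs), content_enc(glyphs_same_char))
    return style_loss + content_loss
```
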
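Likewise, the RS module's scoring rule is not detailed here. A plausible minimal sketch is to rank the candidate references by how closely their content embeddings match the target character's embedding and hand the top-scoring exemplar to the style branch; all names below are hypothetical.

```python
# Hypothetical reference-selection sketch: score each candidate reference
# glyph against the target content code and keep the best match. The actual
# RS module's criterion is not described in this summary and may differ.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_reference(content_enc, target_glyph: torch.Tensor, reference_pool: torch.Tensor) -> torch.Tensor:
    """
    target_glyph:   (1, C, H, W) target character rendered in a plain source font.
    reference_pool: (K, C, H, W) few-shot exemplars in the target style.
    Returns the single reference (1, C, H, W) whose content embedding is
    closest to the target's, i.e. the structurally most informative exemplar.
    """
    target_code = F.normalize(content_enc(target_glyph), dim=-1)   # (1, D)
    pool_codes = F.normalize(content_enc(reference_pool), dim=-1)  # (K, D)
    scores = (pool_codes @ target_code.t()).squeeze(-1)            # (K,) cosine similarities
    best = int(scores.argmax())
    return reference_pool[best : best + 1]
```
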
Abstract

Few-shot Font Generation aims to generate stylistically consistent glyphs from a few reference glyphs. However, capturing complex font styles from a few exemplars remains challenging, and existing methods often struggle to retain discernible local characteristics in generated samples. This paper introduces DRG-Font, a contrastive font generation strategy that learns complex glyph attributes by decomposing style and content embedding spaces. For optimal style supervision, the proposed architecture incorporates a Reference Selection (RS) Module to dynamically select the best style reference from an available pool of candidates. The network learns to decompose glyph attributes into style and shape priors through a Multi-scale Style Head Block (MSHB) and a Multi-scale Content Head Block (MCHB). For style adaptation, a Multi-Fusion Upsampling Block (MFUB) produces the target glyph by combining the reference style prior and target content prior. The proposed method demonstrates significant improvements over state-of-the-art approaches across multiple visual and analytical benchmarks.
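
To make the abstract's pipeline concrete, below is a simplified sketch of how a selected reference's style prior might be fused with multi-scale content features inside an upsampling decoder. The AdaIN-style modulation and layer choices are assumptions for illustration only, not the paper's actual MSHB/MCHB/MFUB design.

```python
# Simplified fusion/upsampling sketch, not the paper's MFUB: the style prior
# modulates normalized content features while the block upsamples toward the
# output glyph resolution. AdaIN-style modulation is an assumption here.
import torch
import torch.nn as nn

class FusionUpBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, style_dim: int):
        super().__init__()
        self.to_scale = nn.Linear(style_dim, in_ch)   # style code -> per-channel scale
        self.to_shift = nn.Linear(style_dim, in_ch)   # style code -> per-channel shift
        self.norm = nn.InstanceNorm2d(in_ch, affine=False)
        self.up = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
        )

    def forward(self, content_feat: torch.Tensor, style_code: torch.Tensor) -> torch.Tensor:
        # Inject the style prior into the normalized content features (AdaIN-like),
        # then upsample toward the next scale of the decoder.
        scale = self.to_scale(style_code).unsqueeze(-1).unsqueeze(-1)
        shift = self.to_shift(style_code).unsqueeze(-1).unsqueeze(-1)
        fused = self.norm(content_feat) * (1 + scale) + shift
        return self.up(fused)
```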