Generative Photomosaic with Structure-Aligned and Personalized Diffusion

arXiv cs.CV / 4/9/2026


Key Points

  • The paper proposes what it claims is the first generative approach to photomosaic creation, replacing tile-by-tile color matching with diffusion-based synthesis.
  • It conditions diffusion on a reference image through a low-frequency mechanism, aligning global structure while still allowing prompt-driven detail variation.
  • The method aims to improve both semantic expressiveness and structural coherence compared with traditional, matching-based photomosaics that require many tile images.
  • It introduces few-shot personalized diffusion so users can generate stylistically consistent or user-specific tiles without needing a large image collection.
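
For context, the matching-based baseline that the paper contrasts itself with can be sketched in a few lines. This is a minimal illustration of classical mean-color tile matching, not the paper's method; the function names (`mean_color`, `match_tiles`) and the choice of Euclidean distance in RGB are assumptions made for the example.

```python
import numpy as np

def mean_color(img):
    """Average RGB color of an H x W x 3 array."""
    return img.reshape(-1, 3).mean(axis=0)

def match_tiles(target, tiles, tile_size):
    """Classical photomosaic assignment (illustrative baseline only).

    For every tile_size x tile_size cell of `target`, return the index of
    the library tile whose mean color is closest in RGB space.
    """
    h, w, _ = target.shape
    rows, cols = h // tile_size, w // tile_size
    tile_means = np.stack([mean_color(t) for t in tiles])  # shape (N, 3)
    choice = np.empty((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            cell = target[r * tile_size:(r + 1) * tile_size,
                          c * tile_size:(c + 1) * tile_size]
            # Euclidean distance between the cell's mean color and each tile's mean color
            dists = np.linalg.norm(tile_means - mean_color(cell), axis=1)
            choice[r, c] = int(np.argmin(dists))
    return choice
```

Because each cell can only be as accurate as the closest tile in the library, quality under this scheme depends on having many tiles, which is the limitation the key points describe.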

Abstract

We present the first generative approach to photomosaic creation. Traditional photomosaic methods rely on a large number of tile images and color-based matching, which limits both diversity and structural consistency. Our generative photomosaic framework synthesizes tile images using diffusion-based generation conditioned on reference images. A low-frequency conditioned diffusion mechanism aligns global structure while preserving prompt-driven details. This generative formulation enables photomosaic composition that is both semantically expressive and structurally coherent, effectively overcoming the fundamental limitations of matching-based approaches. By leveraging few-shot personalized diffusion, our model is able to produce user-specific or stylistically consistent tiles without requiring an extensive collection of images.
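
The abstract does not spell out how the low-frequency conditioning is implemented, so the following is only a sketch of the underlying idea in plain image space: take the coarse (low-frequency) content from a reference and keep the fine detail of a generated image. The box-filter low-pass, the `factor` parameter, and the function names are assumptions for illustration; a diffusion model would more plausibly apply such a constraint during sampling rather than as a post-hoc blend.

```python
import numpy as np

def low_pass(img, factor):
    """Box-filter low-pass (assumed filter, chosen for simplicity).

    Averages over factor x factor blocks, then upsamples by repetition,
    so the result has the same shape as `img`.
    """
    h, w = img.shape[:2]
    assert h % factor == 0 and w % factor == 0, "size must be divisible by factor"
    blocks = img.reshape(h // factor, factor, w // factor, factor, -1)
    coarse = blocks.mean(axis=(1, 3))
    up = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
    return up.reshape(img.shape)

def swap_low_frequencies(generated, reference, factor):
    """Keep the high-frequency detail of `generated`, but replace its
    low-frequency content with that of `reference`."""
    return generated - low_pass(generated, factor) + low_pass(reference, factor)
```

With this particular filter, the output's block averages equal those of the reference while its within-block detail equals that of the generated image, which is one simple way to read "aligns global structure while preserving prompt-driven details".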