Color Conditional Generation with Sliced Wasserstein Guidance

arXiv cs.CV / 5/4/2026

📰 News · Tools & Practical Usage · Models & Research

Key Points

  • The paper introduces SW-Guidance, a training-free method for diffusion-based image generation that conditions output on the color distribution of a reference image.
  • Instead of applying color transfer after text-to-image generation (which can produce semantically meaningless colors), SW-Guidance directly modifies the diffusion sampling process using a differentiable Sliced 1-Wasserstein distance.
  • By incorporating the color-distribution distance between the generated image and the reference palette during sampling, the method improves color similarity while preserving semantic coherence with the text prompt.
  • The authors report that SW-Guidance outperforms state-of-the-art methods for color-conditional generation in color similarity to the reference, without sacrificing fidelity to the text prompt.
  • The accompanying source code is provided to enable reproduction and further experimentation.
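The core quantity is simple to state: project both color point clouds onto random unit directions, where the 1-D Wasserstein-1 distance reduces to sorting, then average over directions. A minimal NumPy sketch of that idea (the function name and defaults here are illustrative, not taken from the paper's code, which implements a differentiable version inside the sampler):

```python
import numpy as np

def sliced_w1(colors_a, colors_b, n_projections=64, seed=0):
    """Sliced 1-Wasserstein distance between two equal-size color point clouds.

    colors_a, colors_b: (N, 3) arrays of RGB pixels. Each cloud is projected
    onto random unit directions; in 1-D the Wasserstein-1 distance is exact
    after sorting, and the sliced distance averages over directions.
    """
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_projections, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit directions
    proj_a = np.sort(colors_a @ dirs.T, axis=0)           # (N, n_projections)
    proj_b = np.sort(colors_b @ dirs.T, axis=0)
    return float(np.abs(proj_a - proj_b).mean())
```

Because sorting is the only combinatorial step, the same computation stays cheap for thousands of sampled pixels, which is what makes it practical to evaluate at every diffusion step.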

Abstract

We propose SW-Guidance, a training-free approach for image generation conditioned on the color distribution of a reference image. While it is possible to generate an image with fixed colors by first creating an image from a text prompt and then applying a color style transfer method, this approach often results in semantically meaningless colors in the generated image. Our method solves this problem by modifying the sampling process of a diffusion model to incorporate the differentiable Sliced 1-Wasserstein distance between the color distribution of the generated image and the reference palette. Our method outperforms state-of-the-art techniques for color-conditional generation in terms of color similarity to the reference, producing images that not only match the reference colors but also maintain semantic coherence with the original text prompt. Our source code is available at https://github.com/alobashev/sw-guidance/.