NumColor: Precise Numeric Color Control in Text-to-Image Generation
arXiv cs.CV / 3/17/2026
Key Points
- The paper finds that diffusion models struggle with precise numeric colors because subword tokenization fragments color codes into tokens that carry no color meaning.
- NumColor introduces a Color Token Aggregator and a ColorBook containing 6,707 learnable embeddings that map colors, represented in the perceptually uniform CIE Lab space, into the text encoder's embedding space to enable accurate color control.
- It uses two auxiliary losses, directional alignment and interpolation consistency, to enforce a geometric mapping between Lab space and the embedding space, enabling smooth color interpolation.
- A synthetic dataset, NumColor-Data, with 500,000 images provides unambiguous color-to-pixel correspondence to train the ColorBook, avoiding annotation ambiguity from photographs.
- NumColor transfers zero-shot to multiple diffusion models (e.g., SD3, SD3.5, PixArt-α, PixArt-Σ) and delivers 4-9x improvements in numerical color accuracy and 10-30x improvements in color harmony on GenColorBench.
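
The points above rely on representing a numeric color code in CIE Lab space. As a rough illustration only, the sketch below converts a hex code to Lab using the standard sRGB (D65) formulas, measures a Euclidean Lab distance (CIE76 ΔE), and interpolates linearly between two Lab colors. None of this is taken from the NumColor code: the function names are hypothetical, and the distance shown is a common convention, not necessarily the metric used on GenColorBench.

```python
import re

# D65 reference white, with Y normalized to 1.0 (assumption: sRGB input).
_WHITE = (0.95047, 1.0, 1.08883)


def hex_to_rgb(code):
    """Parse '#RRGGBB' or 'RRGGBB' into an (r, g, b) tuple of 0-255 ints."""
    match = re.fullmatch(r"#?([0-9a-fA-F]{6})", code.strip())
    if match is None:
        raise ValueError(f"not a 6-digit hex color: {code!r}")
    digits = match.group(1)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def _srgb_to_linear(channel):
    """Undo the sRGB transfer curve for one 0-255 channel."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function from the CIE Lab definition."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    return t / (3 * delta * delta) + 4 / 29


def rgb_to_lab(rgb):
    """Convert an sRGB triple (0-255) to CIE Lab under a D65 white point."""
    r, g, b = (_srgb_to_linear(c) for c in rgb)
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_lab_f(v / w) for v, w in zip((x, y, z), _WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(lab1, lab2):
    """Euclidean distance in Lab space (the CIE76 color difference)."""
    return sum((p - q) ** 2 for p, q in zip(lab1, lab2)) ** 0.5


def lerp_lab(lab1, lab2, t):
    """Linear interpolation between two Lab colors, t in [0, 1]."""
    return tuple(p + t * (q - p) for p, q in zip(lab1, lab2))


if __name__ == "__main__":
    red = rgb_to_lab(hex_to_rgb("#FF0000"))
    blue = rgb_to_lab(hex_to_rgb("#0000FF"))
    print("red in Lab:", red)
    print("midpoint in Lab:", lerp_lab(red, blue, 0.5))
    print("distance red to blue:", delta_e76(red, blue))
```

Because Lab is approximately perceptually uniform, equal steps along such an interpolation correspond to roughly equal perceived color changes, which is the property the interpolation-consistency objective is described as carrying over to the embedding space.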