Towards Design Compositing
arXiv cs.CV / 4/17/2026
Key Points
- The paper argues that modern graphic design generation often assumes the provided image, text, and logo inputs are already stylistically harmonious, an assumption that breaks down when assets come from mismatched sources.
- It proposes GIST, a training-free, identity-preserving image compositing module positioned between layout prediction and typography (design text) generation.
- GIST can be plugged into existing components-to-design or design-refining pipelines without modifying them; it aims to improve harmony by stylizing and compositing the inputs rather than pasting them in unchanged.
- Experiments integrating GIST with LaDeCo and Design-o-meter show improved visual harmony and aesthetic quality, as judged by LLaVA-OV and GPT-4V through aspect-wise ratings and pairwise preference comparisons against naive pasting.
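
The plug-in placement described above (a compositing step between layout prediction and typography generation) can be illustrated with a minimal sketch. This is not the paper's implementation: every name here (`Element`, `predict_layout`, `naive_paste`, `harmonize`, `generate_typography`, `run_pipeline`) is hypothetical, and the "harmonization" is a placeholder that only flips a style tag. The sketch shows only how a swappable stage can be inserted without changing the stages around it.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical design element; not a type from the paper."""
    name: str
    kind: str                                   # "image", "logo", or "text"
    box: Tuple[int, int, int, int] = (0, 0, 0, 0)  # x, y, width, height
    style: str = "original"


# A stage maps a list of elements to a new list of elements.
Stage = Callable[[List[Element]], List[Element]]


def predict_layout(elements: List[Element]) -> List[Element]:
    # Stand-in layout predictor: stack elements vertically.
    return [replace(e, box=(0, 100 * i, 400, 100)) for i, e in enumerate(elements)]


def naive_paste(elements: List[Element]) -> List[Element]:
    # Baseline compositor: place the assets exactly as provided.
    return list(elements)


def harmonize(elements: List[Element]) -> List[Element]:
    # Placeholder compositor (assumption): restyle image-like assets only,
    # leaving each element's name and box (its "identity") untouched.
    return [
        replace(e, style="harmonized") if e.kind in ("image", "logo") else e
        for e in elements
    ]


def generate_typography(elements: List[Element]) -> List[Element]:
    # Stand-in typography stage: mark text elements as typeset.
    return [replace(e, style="typeset") if e.kind == "text" else e for e in elements]


def run_pipeline(elements: List[Element], compositor: Stage = naive_paste) -> List[Element]:
    # The compositor sits between layout prediction and typography generation;
    # swapping it requires no change to the other two stages.
    return generate_typography(compositor(predict_layout(elements)))


assets = [Element("hero", "image"), Element("brand", "logo"), Element("headline", "text")]
baseline = run_pipeline(assets)
composited = run_pipeline(assets, compositor=harmonize)
```

Because the compositor is passed in as an ordinary callable, the same pipeline can produce both the naive-pasting baseline and the composited variant, which is the kind of paired output a pairwise preference comparison would need.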



