Learning Illumination Control in Diffusion Models

arXiv cs.LG / 4/29/2026

💬 Opinion · Developer Stack & Infrastructure · Models & Research

Key Points

  • The paper introduces a fully open-source, reproducible pipeline for learning illumination control in diffusion-based image generation models.
  • It builds a “data engine” that converts well-lit images into supervised training triplets: a poorly lit input, a natural-language lighting instruction, and a well-lit target output (see the first sketch after this list).
  • The authors fine-tune diffusion models on this dataset and report significant gains over SD 1.5, SDXL, and FLUX.1-dev baselines (see the second sketch below).
  • Improvements are evaluated using perceptual similarity, structural similarity, and identity preservation metrics.
  • The release includes all code, data, and model weights, enabling other researchers to reproduce and build upon the method.
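
To make the “data engine” idea concrete, here is a minimal sketch of how such triplets could be synthesized: take a well-lit photo as the target, apply a synthetic lighting defect to produce the input, and pair the defect with the instruction that asks the model to undo it. The degradation presets, parameter ranges, and instruction templates below are illustrative assumptions, not the paper's actual recipe.

```python
import random
import numpy as np
from PIL import Image, ImageEnhance

# Hypothetical presets pairing a synthetic lighting defect with the
# instruction that asks the model to undo it. The paper's actual
# degradations and instruction templates may differ.
DEGRADATIONS = [
    ("underexpose",  "Brighten the scene to a natural, well-lit look."),
    ("warm_cast",    "Remove the warm color cast and restore neutral lighting."),
    ("low_contrast", "Increase contrast so the lighting looks crisp and even."),
]

def degrade(img: Image.Image, kind: str) -> Image.Image:
    """Apply one synthetic lighting defect to a well-lit image."""
    if kind == "underexpose":
        # Darken via a brightness factor well below 1.0.
        return ImageEnhance.Brightness(img).enhance(random.uniform(0.2, 0.5))
    if kind == "warm_cast":
        # Scale RGB channels unevenly to simulate a warm color cast.
        arr = np.asarray(img).astype(np.float32)
        arr *= np.array([1.15, 1.0, 0.75])  # boost red, cut blue
        return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if kind == "low_contrast":
        return ImageEnhance.Contrast(img).enhance(random.uniform(0.3, 0.6))
    raise ValueError(f"unknown degradation: {kind}")

def make_triplet(path: str) -> dict:
    """Turn one well-lit photo into an (input, instruction, target) triplet."""
    target = Image.open(path).convert("RGB")
    kind, instruction = random.choice(DEGRADATIONS)
    return {
        "input": degrade(target, kind),   # poorly lit input image
        "instruction": instruction,       # natural-language lighting edit
        "target": target,                 # well-lit supervision target
    }
```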

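The summary does not specify how the instruction and input image condition the model, so the sketch below assumes an InstructPix2Pix-style editing interface, a common choice when fine-tuning SD 1.5 on (input image, instruction, target image) triplets. The checkpoint name is a placeholder, not the authors' released repository.

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

# Placeholder checkpoint id; the paper's released weights may use a
# different name and a different conditioning mechanism.
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "example-org/sd15-illumination-control",
    torch_dtype=torch.float16,
).to("cuda")

poorly_lit = Image.open("dim_room.jpg").convert("RGB")

result = pipe(
    prompt="Brighten the scene to a natural, well-lit look.",
    image=poorly_lit,
    num_inference_steps=20,
    image_guidance_scale=1.5,  # faithfulness to the input image
    guidance_scale=7.0,        # adherence to the text instruction
).images[0]

result.save("relit.png")
```

The two guidance scales trade off faithfulness to the input image against adherence to the lighting instruction, which matters here since relighting should change illumination without altering scene content.
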
Abstract

Controlling illumination in images is essential for photography and visual content creation. While closed-source models have demonstrated impressive illumination control, open-source alternatives either require heavy control inputs such as depth maps or do not release their data and code. We present a fully open-source, reproducible pipeline for learning illumination control in diffusion models. Our approach builds a data engine that transforms well-lit images into supervised training triplets consisting of a poorly illuminated input image, a natural-language lighting instruction, and a well-illuminated output image. We fine-tune a diffusion model on this data and demonstrate significant improvements over baseline SD 1.5, SDXL, and FLUX.1-dev models in perceptual similarity, structural similarity, and identity preservation. Our work provides a reproducible solution built entirely with open-source tools and publicly available data. We release all code, data, and model weights.
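
The abstract names the metric families but not their exact implementations. A common instantiation, and the assumption behind this sketch, is LPIPS for perceptual similarity and SSIM for structural similarity:

```python
import lpips
import numpy as np
import torch
from skimage.metrics import structural_similarity as ssim

# LPIPS expects float tensors in [-1, 1], shaped (N, 3, H, W).
lpips_fn = lpips.LPIPS(net="alex")

def to_tensor(img: np.ndarray) -> torch.Tensor:
    """uint8 HWC image -> float NCHW tensor in [-1, 1]."""
    t = torch.from_numpy(img).float().permute(2, 0, 1) / 127.5 - 1.0
    return t.unsqueeze(0)

def evaluate(pred: np.ndarray, target: np.ndarray) -> dict:
    """Score one relit output against its well-lit target.

    Lower LPIPS is better; higher SSIM is better.
    """
    perceptual = lpips_fn(to_tensor(pred), to_tensor(target)).item()
    structural = ssim(pred, target, channel_axis=2, data_range=255)
    return {"lpips": perceptual, "ssim": structural}
```

Identity preservation is typically scored as the cosine similarity between face embeddings (e.g., from an ArcFace model) of the relit output and the reference; the summary does not say which embedder the authors use, so it is omitted here.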