Most local AI image tools give you text-to-image and call it done. You type a prompt, get an image, and if it's not what you wanted — you start over with a different prompt. That's fine for exploration, but it's a terrible workflow when you have a specific result in mind.
Image-to-Image (I2I) changes that. You start with a reference image — a photo, a sketch, a previous generation — and tell the model what to change. Keep the composition, adjust the style. Keep the pose, change the outfit. Keep the layout, make it photorealistic. The source image anchors the generation so you're refining instead of rolling dice.
We added I2I to Locally Uncensored in v2.3.0, and it works with every image model the app supports. Here's how it works and which models to use for what.
How Image-to-Image Works
The core mechanic is denoise strength — a value between 0.0 and 1.0 that controls how much the model changes your source image.
- 0.1–0.3: Subtle adjustments. Color grading, minor style shifts, texture changes. The original image is clearly recognizable.
- 0.4–0.6: Moderate transformation. The composition and major shapes stay, but details, colors, and style can change significantly. This is the sweet spot for most use cases.
- 0.7–0.9: Heavy reimagining. The model uses your image as a loose guide but generates most of the content from scratch based on your prompt.
- 1.0: Effectively text-to-image with the same dimensions. The source image has no influence.
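Under the hood, most samplers implement denoise strength by running only a fraction of the scheduled diffusion steps: the source image is noised to the matching level, then denoised from there. A minimal sketch of that mapping (the exact rounding and scheduling vary by sampler):

```python
def steps_to_run(total_steps: int, denoise: float) -> int:
    # denoise=1.0 runs every step (pure text-to-image);
    # denoise=0.0 runs none (the source image passes through unchanged).
    # In between, the source latent is noised to the level of step
    # total_steps - steps_to_run, then denoised the rest of the way.
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0.0 and 1.0")
    return round(total_steps * denoise)
```

So at 25 steps, a denoise of 0.4 only runs 10 of them, which is also why low-denoise passes finish faster.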
In Locally Uncensored, you drag and drop (or paste) a source image, set the denoise slider, write your prompt, and hit Generate. The app handles the ComfyUI workflow construction automatically — no node editing required.
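For readers curious what the app is assembling behind the scenes, here is a hypothetical sketch of a minimal I2I graph in ComfyUI's API format. The node class names (LoadImage, VAEEncode, KSampler, and so on) are ComfyUI built-ins; the node IDs, checkpoint filename, and sampler settings here are illustrative assumptions, not the app's actual graph:

```python
def build_i2i_workflow(image_name: str, prompt: str, denoise: float,
                       ckpt: str = "example_sdxl.safetensors") -> dict:
    # Each key is a node ID; ["1", 2] means "output slot 2 of node 1".
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt}},
        "2": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # Encode the source image into latent space instead of starting
        # from an empty latent -- this is what makes it img2img.
        "3": {"class_type": "VAEEncode",
              "inputs": {"pixels": ["2", 0], "vae": ["1", 2]}},
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["1", 1]}},
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["1", 1]}},  # negative prompt
        "6": {"class_type": "KSampler", "inputs": {
            "model": ["1", 0], "positive": ["4", 0], "negative": ["5", 0],
            "latent_image": ["3", 0], "seed": 0, "steps": 25, "cfg": 6.0,
            "sampler_name": "euler", "scheduler": "normal",
            "denoise": denoise,  # the I2I strength slider maps here
        }},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        "8": {"class_type": "SaveImage",
              "inputs": {"images": ["7", 0], "filename_prefix": "i2i"}},
    }
```

The only structural difference from a text-to-image graph is that the KSampler's latent input comes from a VAE-encoded source image rather than an empty latent, with denoise below 1.0.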
Which Models to Use
SDXL Models (Juggernaut XL, RealVisXL, DreamShaper XL)
Best for: Photorealistic transformations, portrait refinement, product photography.
SDXL models run in as little as 6 GB of VRAM, making them the most accessible option. Juggernaut XL V9 is particularly strong for photorealistic I2I: it preserves facial structure well and handles skin textures naturally. DreamShaper XL leans more artistic if that's what you're after.
Typical workflow: Take a phone photo, upload it, set denoise to 0.35–0.50, prompt with the style you want. "professional headshot, studio lighting, shallow depth of field" turns a casual selfie into something usable.
VRAM: 6–8 GB
FLUX Models (FLUX.1 Schnell, FLUX.1 Dev, FLUX 2 Klein)
Best for: Text rendering in images, complex scene modifications, architectural visualization.
FLUX handles text-in-image better than any other local model. If your I2I task involves changing text on a sign, modifying UI mockups, or generating images where readable text matters — FLUX is the answer.
FLUX 2 Klein is the newest and fastest overall. Within the FLUX.1 family, Dev gives higher quality but takes longer, while Schnell is the speed option.
Typical workflow: Screenshot a UI design, upload it, set denoise to 0.40–0.55, prompt with modifications. "dark mode version, rounded corners, blue accent color" — and the text stays readable.
VRAM: 8–12 GB
Z-Image (Turbo and Base)
Best for: Unrestricted content, fast iteration, creative exploration without safety filters.
Z-Image has zero content filtering. No prompt rejection, no safety classifiers. Whatever you describe, it generates. The Turbo variant does this in 8–15 seconds.
For I2I specifically, Z-Image Turbo is excellent for rapid iteration — the speed means you can try 10 variations in the time other models do 2. Set denoise low (0.2–0.35) for quick style transfers, or high (0.6+) for dramatic transformations.
Z-Image Base produces higher quality output but takes longer. Use Turbo for exploration, Base for final renders.
VRAM: 10–16 GB
Practical I2I Workflows
Style Transfer
Upload a photograph, set denoise to 0.45–0.55, prompt with an art style: "oil painting, impressionist style, warm palette." The composition stays, the medium changes.
Sketch to Render
Draw a rough sketch on paper or in any drawing app. Upload it. Set denoise to 0.65–0.80. The model uses your sketch as a structural guide and fills in realistic or stylized details based on your prompt.
Iterative Refinement
Generate a text-to-image result that's 80% right. Use it as I2I input with denoise 0.20–0.35 and a prompt focused on what you want fixed. Repeat until it's right. This is dramatically more efficient than regenerating from scratch each time.
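The loop above can be sketched in a few lines. Here `generate` is a hypothetical stand-in for one I2I pass through the app (or the ComfyUI API underneath), not a real function the app exposes:

```python
def refine(image, prompt, generate, rounds=3, denoise=0.3):
    # `generate` is assumed to be a callable (image, prompt, denoise) -> image.
    # Each round feeds the previous output back in as the new source image,
    # so the low denoise value only nudges what's already there.
    for _ in range(rounds):
        image = generate(image, prompt, denoise)
    return image
```

In practice you'd inspect each intermediate result and stop (or tweak the prompt) once it looks right, rather than running a fixed number of rounds.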
Background Replacement
Upload a product photo or portrait. Set denoise to 0.40–0.50. Prompt with a new background description. The subject stays mostly intact while the environment changes. Works best with SDXL models that handle foreground/background separation well.
Setup
If you have Locally Uncensored installed, I2I is already there — no additional setup.
- Open the Create tab
- Click the image upload area (or drag and drop)
- Set the Denoise slider
- Write your prompt describing the desired output
- Hit Generate
If ComfyUI isn't installed yet, the app handles that with one click. Model bundles are available for one-click download — the app shows you which ones your GPU can actually run based on VRAM.
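Under the hood, the app drives ComfyUI's local HTTP API (port 8188 by default). If you ever want to queue a workflow graph yourself, a minimal sketch looks like this; it assumes a ComfyUI instance is already running locally and that you have a workflow graph in API JSON format:

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # ComfyUI's default local address

def encode_payload(graph: dict) -> bytes:
    # ComfyUI's /prompt endpoint expects the graph under a "prompt" key.
    return json.dumps({"prompt": graph}).encode("utf-8")

def submit(graph: dict) -> dict:
    # Queues the workflow; the response includes a prompt_id you can
    # poll via the /history endpoint to retrieve finished images.
    req = urllib.request.Request(
        COMFY_URL + "/prompt",
        data=encode_payload(graph),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This is exactly the plumbing the app hides behind the Generate button, so there's no need to touch it unless you want to script generations.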
Source
Locally Uncensored is open source (AGPL-3.0):
- GitHub: PurpleDoubleD/locally-uncensored
- Website: locallyuncensored.com
I2I shipped in v2.3.0 alongside Image-to-Video (FramePack on 6 GB VRAM), ComfyUI plug & play, and one-click model bundles.


