Most local AI image tools give you text-to-image and call it done. You type a prompt, get an image, and if it's not what you wanted — you start over with a different prompt. That's fine for exploration, but it's a terrible workflow when you have a specific result in mind.
Image-to-Image (I2I) changes that. You start with a reference image — a photo, a sketch, a previous generation — and tell the model what to change. Keep the composition, adjust the style. Keep the pose, change the outfit. Keep the layout, make it photorealistic. The source image anchors the generation so you're refining instead of rolling dice.
We added I2I to Locally Uncensored in v2.3.0, and it works with every image model the app supports. Here's how it works and which models to use for what.
How Image-to-Image Works
The core mechanic is denoise strength — a value between 0.0 and 1.0 that controls how much the model changes your source image.
- 0.1–0.3: Subtle adjustments. Color grading, minor style shifts, texture changes. The original image is clearly recognizable.
- 0.4–0.6: Moderate transformation. The composition and major shapes stay, but details, colors, and style can change significantly. This is the sweet spot for most use cases.
- 0.7–0.9: Heavy reimagining. The model uses your image as a loose guide but generates most of the content from scratch based on your prompt.
- 1.0: Effectively text-to-image with the same dimensions. The source image has no influence.
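Under the hood, most samplers implement denoise strength by running only a fraction of the scheduled diffusion steps: the source image is noised to the matching level, then denoised from there. A minimal sketch of that mapping (the exact rounding and scheduling vary by sampler):

```python
def steps_to_run(total_steps: int, denoise: float) -> int:
    # denoise=1.0 runs every step (pure text-to-image);
    # denoise=0.0 runs none (the source image passes through unchanged).
    # In between, the source latent is noised to the level of step
    # total_steps - steps_to_run, then denoised the rest of the way.
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0.0 and 1.0")
    return round(total_steps * denoise)
```

So at 25 steps, a denoise of 0.4 only runs 10 of them, which is also why low-denoise passes finish faster.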
In Locally Uncensored, you drag and drop (or paste) a source image, set the denoise slider, write your prompt, and hit Generate. The app handles the ComfyUI workflow construction automatically — no node editing required.
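For readers curious what the app is assembling behind the scenes, here is a hypothetical sketch of a minimal I2I graph in ComfyUI's API format. The node class names (LoadImage, VAEEncode, KSampler, and so on) are ComfyUI built-ins; the node IDs, checkpoint filename, and sampler settings here are illustrative assumptions, not the app's actual graph:

```python
def build_i2i_workflow(image_name: str, prompt: str, denoise: float,
                       ckpt: str = "example_sdxl.safetensors") -> dict:
    # Each key is a node ID; ["1", 2] means "output slot 2 of node 1".
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt}},
        "2": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # Encode the source image into latent space instead of starting
        # from an empty latent -- this is what makes it img2img.
        "3": {"class_type": "VAEEncode",
              "inputs": {"pixels": ["2", 0], "vae": ["1", 2]}},
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["1", 1]}},
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["1", 1]}},  # negative prompt
        "6": {"class_type": "KSampler", "inputs": {
            "model": ["1", 0], "positive": ["4", 0], "negative": ["5", 0],
            "latent_image": ["3", 0], "seed": 0, "steps": 25, "cfg": 6.0,
            "sampler_name": "euler", "scheduler": "normal",
            "denoise": denoise,  # the I2I strength slider maps here
        }},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        "8": {"class_type": "SaveImage",
              "inputs": {"images": ["7", 0], "filename_prefix": "i2i"}},
    }
```

The only structural difference from a text-to-image graph is that the KSampler's latent input comes from a VAE-encoded source image rather than an empty latent, with denoise below 1.0.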
Which Models to Use
SDXL Models (Juggernaut XL, RealVisXL, DreamShaper XL)
Best for: Photorealistic transformations, portrait refinement, product photography.
SDXL models run in as little as 6 GB of VRAM, making them the most accessible option. Juggernaut XL V9 is particularly strong for photorealistic I2I: it preserves facial structure well and handles skin textures naturally. DreamShaper XL leans more artistic if that's what you're after.
Typical workflow: Take a phone photo, upload it, set denoise to 0.35–0.50, prompt with the style you want. "professional headshot, studio lighting, shallow depth of field" turns a casual selfie into something usable.
VRAM: 6–8 GB
FLUX Models (FLUX.1 Schnell, FLUX.1 Dev, FLUX 2 Klein)
Best for: Text rendering in images, complex scene modifications, architectural visualization.
FLUX handles text-in-image better than any other local model. If your I2I task involves changing text on a sign, modifying UI mockups, or generating images where readable text matters — FLUX is the answer.
FLUX 2 Klein is the newest and fastest overall. Within the FLUX.1 family, Dev gives higher quality but takes longer, while Schnell is the speed option.
Typical workflow: Screenshot a UI design, upload it, set denoise to 0.40–0.55, prompt with modifications. "dark mode version, rounded corners, blue accent color" — and the text stays readable.
VRAM: 8–12 GB
Z-Image (Turbo and Base)
Best for: Unrestricted content, fast iteration, creative exploration without safety filters.
Z-Image has zero content filtering. No prompt rejection, no safety classifiers. Whatever you describe, it generates. The Turbo variant does this in 8–15 seconds.
For I2I specifically, Z-Image Turbo is excellent for rapid iteration — the speed means you can try 10 variations in the time other models do 2. Set denoise low (0.2–0.35) for quick style transfers, or high (0.6+) for dramatic transformations.
Z-Image Base produces higher quality output but takes longer. Use Turbo for exploration, Base for final renders.
VRAM: 10–16 GB
Practical I2I Workflows
Style Transfer
Upload a photograph, set denoise to 0.45–0.55, prompt with an art style: "oil painting, impressionist style, warm palette." The composition stays, the medium changes.
Sketch to Render
Draw a rough sketch on paper or in any drawing app. Upload it. Set denoise to 0.65–0.80. The model uses your sketch as a structural guide and fills in realistic or stylized details based on your prompt.
Iterative Refinement
Generate a text-to-image result that's 80% right. Use it as I2I input with denoise 0.20–0.35 and a prompt focused on what you want fixed. Repeat until it's right. This is dramatically more efficient than regenerating from scratch each time.
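The loop above can be sketched in a few lines. Here `generate` is a hypothetical stand-in for one I2I pass through the app (or the ComfyUI API underneath), not a real function the app exposes:

```python
def refine(image, prompt, generate, rounds=3, denoise=0.3):
    # `generate` is assumed to be a callable (image, prompt, denoise) -> image.
    # Each round feeds the previous output back in as the new source image,
    # so the low denoise value only nudges what's already there.
    for _ in range(rounds):
        image = generate(image, prompt, denoise)
    return image
```

In practice you'd inspect each intermediate result and stop (or tweak the prompt) once it looks right, rather than running a fixed number of rounds.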
Background Replacement
Upload a product photo or portrait. Set denoise to 0.40–0.50. Prompt with a new background description. The subject stays mostly intact while the environment changes. Works best with SDXL models that handle foreground/background separation well.
Setup
If you have Locally Uncensored installed, I2I is already there — no additional setup.
- Open the Create tab
- Click the image upload area (or drag and drop)
- Set the Denoise slider
- Write your prompt describing the desired output
- Hit Generate
If ComfyUI isn't installed yet, the app handles that with one click. Model bundles are available for one-click download — the app shows you which ones your GPU can actually run based on VRAM.
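Under the hood, the app drives ComfyUI's local HTTP API (port 8188 by default). If you ever want to queue a workflow graph yourself, a minimal sketch looks like this; it assumes a ComfyUI instance is already running locally and that you have a workflow graph in API JSON format:

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # ComfyUI's default local address

def encode_payload(graph: dict) -> bytes:
    # ComfyUI's /prompt endpoint expects the graph under a "prompt" key.
    return json.dumps({"prompt": graph}).encode("utf-8")

def submit(graph: dict) -> dict:
    # Queues the workflow; the response includes a prompt_id you can
    # poll via the /history endpoint to retrieve finished images.
    req = urllib.request.Request(
        COMFY_URL + "/prompt",
        data=encode_payload(graph),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This is exactly the plumbing the app hides behind the Generate button, so there's no need to touch it unless you want to script generations.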
Source
Locally Uncensored is open source (AGPL-3.0):
- GitHub: PurpleDoubleD/locally-uncensored
- Website: locallyuncensored.com
I2I shipped in v2.3.0 alongside Image-to-Video (FramePack on 6 GB VRAM), ComfyUI plug & play, and one-click model bundles.


