Added basic support for Z-Image diffusion models

Open Danamir opened this issue 2 months ago • 1 comments

Here is basic support for Z-Image diffusion models, using the workflow and models found at https://comfyanonymous.github.io/ComfyUI_examples/z_image/.

The most notable change compared to the other models is the qwen_3_4b text encoder, which, strangely, is loaded with the type Lumina2 in the clip loader node. The VAE can be either the Flux ae.safetensors or their own VAE found here: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/vae (I did not see any difference between the two).
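To make the loader setup concrete, here is a minimal sketch of the three loader nodes as a ComfyUI API-format prompt fragment. It is not taken from this PR: the node ids, the helper name `zimage_loader_nodes`, and the file names `z_image_turbo_bf16.safetensors` and `qwen_3_4b.safetensors` are assumptions, so substitute whatever files are actually installed.

```python
# Sketch of the loader part of a ComfyUI API-format prompt for Z-Image.
# Node ids ("1", "2", "3") and model file names are assumptions.

def zimage_loader_nodes(vae_name: str = "ae.safetensors") -> dict:
    """Return the three loader nodes described above as a prompt fragment."""
    return {
        # Diffusion model weights (file name is an assumption).
        "1": {
            "class_type": "UNETLoader",
            "inputs": {
                "unet_name": "z_image_turbo_bf16.safetensors",
                "weight_dtype": "default",
            },
        },
        # Qwen 3 4B text encoder, loaded with the lumina2 clip type.
        "2": {
            "class_type": "CLIPLoader",
            "inputs": {"clip_name": "qwen_3_4b.safetensors", "type": "lumina2"},
        },
        # Either the Flux VAE (ae.safetensors) or the Z-Image VAE.
        "3": {
            "class_type": "VAELoader",
            "inputs": {"vae_name": vae_name},
        },
    }


if __name__ == "__main__":
    import json

    print(json.dumps(zimage_loader_nodes(), indent=2))
```

Swapping the VAE only changes the `vae_name` argument; the rest of the fragment stays the same.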

Other than that the workflow is pretty straightforward.

I don't know if you want to add the Arch.zimage to the plugin, but in any case, here it is !

As for the model itself, it's pretty good: relatively fast and lightweight, with good prompt comprehension. The images are a little bit noisy, but that makes a good counterpoint to the sometimes too neat Qwen. I'm going to try using it as a refiner pass.

Cheers !

[Edit]: Since the only model released for now is a "Turbo" version, it must be used with a custom style set to 1.0 CFG and around 8 steps. Increasing the CFG to around 2.0 can improve sharpness, at the cost of doubling the processing time.
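The doubling comes from classifier-free guidance: at CFG 1.0 the unconditional pass can be skipped, while any other value needs a conditional and an unconditional model pass per step. A rough sketch of that arithmetic follows; `SamplerSettings` and `model_evals` are hypothetical names for illustration, not part of the plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerSettings:
    """Hypothetical container for the two style values discussed above."""
    steps: int
    cfg: float


def model_evals(settings: SamplerSettings) -> int:
    """Estimate model forward passes for one image.

    Assumes one pass per step at CFG 1.0 (no unconditional pass needed)
    and two passes per step otherwise.
    """
    passes_per_step = 1 if settings.cfg == 1.0 else 2
    return settings.steps * passes_per_step


if __name__ == "__main__":
    turbo = SamplerSettings(steps=8, cfg=1.0)
    sharper = SamplerSettings(steps=8, cfg=2.0)
    print(model_evals(turbo), model_evals(sharper))
```

With 8 steps this gives 8 passes at CFG 1.0 and 16 at CFG 2.0, which matches the observed doubling.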

Danamir avatar Nov 27 '25 16:11 Danamir

I forgot to mention: the arch detection is a little bit wonky. ComfyUI returns unknown as the model type for now, so I based the detection on the model filename only. To be updated later.
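For illustration, a filename-only check could look roughly like the sketch below. This is not the code from the PR: the function name `looks_like_zimage` and the accepted spellings (`z_image`, `z-image`, `zimage`) are assumptions.

```python
import re
from pathlib import PurePath

# Matches "z_image", "z-image", "z image" or "zimage", but not when the "z"
# is the tail of a longer word (e.g. "blitz_image").
_ZIMAGE_PATTERN = re.compile(r"(?<![a-z0-9])z[-_ ]?image")


def looks_like_zimage(filename: str) -> bool:
    """Heuristic: guess from the file name alone whether a model is Z-Image."""
    # Normalize Windows separators so only the final path component is checked.
    stem = PurePath(filename.replace("\\", "/")).stem.lower()
    return _ZIMAGE_PATTERN.search(stem) is not None


if __name__ == "__main__":
    for name in ("z_image_turbo_bf16.safetensors", "flux1-dev.safetensors"):
        print(name, looks_like_zimage(name))
```

A check like this breaks as soon as a file is renamed, which is why it is only a stopgap until ComfyUI reports a proper model type.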

Danamir avatar Nov 27 '25 16:11 Danamir

Looks good!

GitHub says there are conflicts, maybe it needs a rebase?

Acly avatar Nov 28 '25 19:11 Acly

I'll do the detection update and the rebase.

Danamir avatar Nov 28 '25 21:11 Danamir

the full model should be releasing soon ™️ , would be nice to keep that in mind so we can just plug and play. though we don't really know what settings that model will prefer.

rktvr avatar Nov 29 '25 00:11 rktvr

> the full model should be releasing soon ™️ , would be nice to keep that in mind so we can just plug and play. though we don't really know what settings that model will prefer.

This PR only addresses the most basic detection, so that the Z-Image model architecture is correctly detected and the appropriate nodes are used. I did not go into any style specifics for Turbo or non-Turbo, so it should work out of the box for the standard model too.

That being said, there will be a need for a new Arch for the edit variant.

Danamir avatar Nov 29 '25 01:11 Danamir

Yea, we can make a new PR if changes are needed for the Base/Edit models. Will wait a couple of days with the release in case they come out.

Acly avatar Nov 29 '25 19:11 Acly