Add support for Hunyuan Image 3.0
Nice! Will take a look 👍
Thank you, @yousef-rafat, you are a GigaChad! ~~(unlike you know who)~~
What are the VRAM/RAM requirements currently + speed per iteration? Does it support quantization?
At least 6 GB for the non-MoE part. The MoE part will depend on how much VRAM is available: the lower the VRAM, the more steeply generation time increases. I'm still experimenting with running the model on a single GPU, though, so no final numbers yet.
Will you implement a CPU offload mechanism like in llama.cpp, keeping only the active parameters in VRAM and making the rest offloadable to CPU/system RAM? Not that you have to do it, support itself is already amazing. But if that is your plan, try contacting the guy behind the multigpu node, he knows a lot about memory management in ComfyUI (;
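(For context, the llama.cpp-style idea boils down to parking expert weights in system RAM and paging only the experts the router actually picks into VRAM per forward pass. A minimal PyTorch sketch of that pattern; the class name, the `expert_ids` routing interface, and the module layout are all hypothetical, not this PR's actual implementation:)

```python
import torch
import torch.nn as nn

class OffloadedExperts(nn.Module):
    """Keep all expert weights in system RAM; copy only the experts the
    router actually selects onto the GPU for each forward pass.
    A sketch of the llama.cpp-style pattern, not ComfyUI's real API."""

    def __init__(self, experts: nn.ModuleList, device: str = "cuda"):
        super().__init__()
        self.experts = experts.cpu()  # all experts live on the CPU by default
        self.device = device

    def forward(self, x: torch.Tensor, expert_ids: torch.Tensor) -> torch.Tensor:
        out = torch.zeros_like(x)
        for eid in expert_ids.unique().tolist():
            expert = self.experts[eid].to(self.device)  # page this expert into VRAM
            mask = expert_ids == eid
            out[mask] = expert(x[mask])
            expert.cpu()  # page it back out to free VRAM for the next expert
        return out
```

The obvious trade-off is the PCIe transfer cost per expert swap, which is presumably where the steep slowdown at low VRAM comes from.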
@yousef-rafat How can we playtest this? Can you share some instructions and a workflow? Do you have a ComfyUI-compatible checkpoint, or is it the raw model for now?
How do you convert the original Hunyuan Image 3 split checkpoint into a loadable ComfyUI format?
@kabachuha Sorry for the late response, I've been busy. I usually start by testing the model independently in a ComfyUI-style format, just to make sure everything works as it should, before testing it inside ComfyUI itself. This is the script I'm working with: test_.py
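(For anyone following along: a standalone smoke test of this kind typically just loads the converted weights and sanity-checks them outside the ComfyUI graph. A hypothetical sketch, not the actual test_.py attached above; the file name and the commented-out `build_model` helper are assumptions:)

```python
import torch
from safetensors.torch import load_file

# Hypothetical smoke test: load the converted state dict and check that
# keys and shapes survived the conversion before wiring it into ComfyUI.
state_dict = load_file("hunyuan_image3_comfyui.safetensors")

print(f"{len(state_dict)} tensors loaded")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape), tensor.dtype)

# A quick forward pass would follow, assuming some build_model() helper
# exists (it does not in this PR yet):
# model = build_model(state_dict).eval()
# with torch.no_grad():
#     latents = torch.randn(1, 16, 64, 64)
#     print(model(latents, timestep=torch.tensor([500])).shape)
```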
The checkpoint conversion isn't hard once I know how everything should fit together. I should be able to post an update on the checkpoint loading part soon.
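(The usual pattern for this kind of conversion is to merge the sharded safetensors files into one state dict and remap the original key prefixes to whatever the ComfyUI loader expects. A rough sketch under those assumptions; the shard directory and the prefix mapping are placeholders, not the real ones from this PR:)

```python
import glob
from safetensors.torch import load_file, save_file

# Merge the split Hunyuan Image 3 shards into a single state dict.
merged = {}
for shard in sorted(glob.glob("hunyuan_image3/*.safetensors")):
    merged.update(load_file(shard))

# Remap key prefixes to a ComfyUI-style layout. The mapping below is
# illustrative; the real prefixes depend on how this PR names the modules.
remapped = {}
for key, tensor in merged.items():
    new_key = key.replace("model.", "diffusion_model.", 1)  # placeholder mapping
    remapped[new_key] = tensor

save_file(remapped, "hunyuan_image3_comfyui.safetensors")
print(f"Wrote {len(remapped)} tensors")
```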