ComfyUI_ExtraModels Major overhaul / better native integration

The plan is to do a full rewrite/refactor of this repo to have better integration with most of the native comfy code. This should make things less fragile (and less cumbersome in general).

Progress/steps:

[x] Remove HyDiT (already supported in base ComfyUI)
[x] Rewrite PixArt base code
[x] Add T5 for PixArt
[x] Rewrite Sana base code
[ ] Add Gemma for Sana [mostly done, missing logic]
[ ] Rewrite DiT base code
[ ] Rewrite VAE loader
[ ] Add back LoRA support

Major changes:

Text encoders return comfy compatible CLIP objects instead of custom types
Auto config detection wherever possible to minimize user error
Single node with dropdown for all models instead of separate node sets per model
PixArt/DiT/etc models are loaded from unet (diffusion_models) folder instead of checkpoints folder
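
As a rough illustration of the single-node idea, here is a minimal sketch of one loader node whose dropdown picks a per-model loader function. It follows the usual ComfyUI node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`), but `ExampleUnetLoader`, `LOADERS`, `load_pixart` and `load_sana` are made-up names, and the state dict is passed in directly so the snippet runs without ComfyUI. It is not the code in this repo.

```python
# Hypothetical sketch: one node, one dropdown, one loader function per model.
# None of these names come from the actual repository.

def load_pixart(sd):
    # Placeholder loader: a real one would build and return a model object.
    return {"arch": "PixArt", "num_keys": len(sd)}


def load_sana(sd):
    # Placeholder loader, same shape as above.
    return {"arch": "Sana", "num_keys": len(sd)}


# Dispatch table: dropdown label -> loader function.
LOADERS = {"PixArt": load_pixart, "Sana": load_sana}


class ExampleUnetLoader:
    @classmethod
    def INPUT_TYPES(cls):
        # In ComfyUI, a list as the first tuple element is shown as a dropdown.
        return {"required": {"model_type": (list(LOADERS.keys()),)}}

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "load_unet"
    CATEGORY = "ExtraModels"

    def load_unet(self, model_type, sd=None):
        # A real node would read the state dict from the diffusion_models
        # folder; here it is passed in to keep the sketch self-contained.
        loader_fn = LOADERS[model_type]
        return (loader_fn(sd or {}),)
```

Adding a new architecture then only means adding one entry to the dispatch table, rather than a whole new set of nodes.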

Other possible ideas/plans:

GGUF support if ComfyUI-GGUF is installed
Generic resolution select node with dropdown and slider/float input
Change to native comfy attention for PixArt/Sana
Add proper ControlNet for PixArt (current version never worked correctly)
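
For the resolution select idea, a minimal sketch of the math such a node might do: take an aspect ratio from the dropdown and a size from the slider, then snap both sides to a fixed multiple. The function name `pick_resolution`, the megapixel-based sizing and the default multiple of 32 are all assumptions for illustration, not a decided design.

```python
def pick_resolution(aspect_w, aspect_h, megapixels=1.0, multiple=32):
    """Return (width, height) close to the requested pixel budget.

    aspect_w / aspect_h is the aspect ratio (e.g. 16 and 9).
    megapixels is the target area in units of 1024 * 1024 pixels.
    multiple is the step both sides are snapped to (assumed value).
    """
    target_area = megapixels * 1024 * 1024
    ratio = aspect_w / aspect_h

    # Solve width * height = target_area with width / height = ratio.
    width = (target_area * ratio) ** 0.5
    height = width / ratio

    def snap(value):
        # Round to the nearest multiple, never going below one step.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


print(pick_resolution(1, 1))
print(pick_resolution(16, 9))
```

Snapping keeps the output on sizes the model can patchify cleanly, at the cost of a slightly different aspect ratio than the one requested.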

Dec 11 '24 02:12 city96

If you want to test out the diffusers format of SANA, I just finetuned this model. https://huggingface.co/frutiemax/themoviedb_1600M_1024px

Dec 26 '24 19:12 frutiemax92

> If you want to test out the diffusers format of SANA, I just finetuned this model. https://huggingface.co/frutiemax/themoviedb_1600M_1024px

This gives the following error:

# ComfyUI Error Report
## Error Details
- **Node ID:** 181
- **Node Type:** EXMUnetLoader
- **Exception Type:** KeyError
- **Exception Message:** 'hidden_size'
## Stack Trace
  File "D:\c2\execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\c2\execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\c2\execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\c2\execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "D:\c2\custom_nodes\ComfyUI_ExtraModels\nodes.py", line 33, in load_unet
    return (loader_fn(sd),)
  File "D:\c2\custom_nodes\ComfyUI_ExtraModels\Sana\loader.py", line 57, in load_sana_state_dict
    model_config = model_config_from_unet(sd)
  File "D:\c2\custom_nodes\ComfyUI_ExtraModels\Sana\loader.py", line 92, in model_config_from_unet
    if config["hidden_size"] == 1152:
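
The trace ends at a plain `config["hidden_size"]` lookup, so the detected config apparently had no `hidden_size` entry for this file. One possible cause (an assumption, not confirmed here) is that the diffusers-format key names are not matched by the detection logic. A minimal sketch of shape-based detection that fails with a readable message instead of a bare `KeyError`; the key names `x_embedder.proj.weight` and `blocks.N.` and the function `model_config_from_shapes` are illustrative assumptions, not the repo's actual code:

```python
def model_config_from_shapes(shapes):
    """Guess a model config from tensor shapes.

    shapes maps state-dict key -> shape tuple. The key names checked
    below are assumed for illustration only.
    """
    config = {}

    # Assumed: the patch embedding weight has hidden_size as its first dim.
    embed = shapes.get("x_embedder.proj.weight")
    if embed is not None:
        config["hidden_size"] = embed[0]

    # Assumed: transformer blocks are stored under "blocks.<index>.".
    block_ids = set()
    for key in shapes:
        parts = key.split(".")
        if parts[0] == "blocks" and len(parts) > 1 and parts[1].isdigit():
            block_ids.add(int(parts[1]))
    if block_ids:
        config["depth"] = max(block_ids) + 1

    # Report what could not be detected instead of failing later on lookup.
    missing = [name for name in ("hidden_size", "depth") if name not in config]
    if missing:
        raise ValueError(
            "Could not detect " + ", ".join(missing)
            + "; the checkpoint may use an unsupported key layout."
        )
    return config
```

With a check like this, an unrecognized checkpoint layout would be reported at detection time rather than surfacing as `KeyError: 'hidden_size'` further down.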

Jan 01 '25 21:01 patientx

Any update on this?

Jan 29 '25 05:01 RandomGuyWithIssues

It's worth noting that when Lumina Image 2 support got added to ComfyUI, that also uses Gemma for a text encoder, so you may be able to use the builtin Gemma support now.

Feb 08 '25 00:02 arcum42

Any news on this?

Feb 27 '25 00:02 frutiemax92

> Any news on this?

I am getting black output as well. Hopefully this gets fixed ASAP. Anyone?

Mar 14 '25 03:03 future-knowin