Feature/qwen eligen support

Open nolan4 opened this issue 3 months ago • 26 comments

Pull Request: Add Entity-Level Image Generation (EliGen) for Qwen Image

Summary

This update implements Entity-Level Image Generation (EliGen) for the Qwen Image model, allowing region-specific prompts through spatial masks. The feature provides fine-grained control over image generation by applying separate attention masks for each entity.

Key Features
• Spatial attention masking with isolated entity prompts
• Automatic mask resizing to match latent dimensions
• RoPE embedding implementation aligned with DiffSynth Studio
• Support for batch_size > 1
• Backward compatible with standard Qwen Image workflows
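As a rough illustration of the first two features, here is a minimal pure-Python sketch of resizing entity masks to a smaller grid and deriving a boolean attention mask in which each entity prompt only interacts with image tokens inside its own region. This is not the code from this PR: the function names `resize_mask_nearest` and `build_attention_mask`, the token layout (entity prompt tokens first, then image tokens), and the choice to leave image-to-image attention unrestricted are all assumptions made for the example, and the target grid size is left as a parameter because the thread does not state the actual latent downscale factor.

```python
from typing import List

Mask = List[List[int]]  # 2D binary mask, 1 = pixel belongs to the entity region


def resize_mask_nearest(mask: Mask, out_h: int, out_w: int) -> Mask:
    """Nearest-neighbour resize of a binary mask to an (out_h, out_w) grid."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def build_attention_mask(
    entity_masks: List[Mask], tokens_per_entity: List[int]
) -> List[List[bool]]:
    """Build an [N, N] matrix where True means attention is allowed.

    Assumed (hypothetical) sequence layout:
    [entity 0 prompt tokens, entity 1 prompt tokens, ..., image tokens].
    """
    flat = [[v for row in m for v in row] for m in entity_masks]
    n_img = len(flat[0])
    n_txt = sum(tokens_per_entity)
    n = n_txt + n_img
    allowed = [[False] * n for _ in range(n)]

    # Assumption: image tokens may always attend to each other.
    for i in range(n_txt, n):
        for j in range(n_txt, n):
            allowed[i][j] = True

    start = 0
    for region, n_tok in zip(flat, tokens_per_entity):
        span = range(start, start + n_tok)
        for i in span:
            # Tokens of one entity prompt see each other, but not other prompts.
            for j in span:
                allowed[i][j] = True
            # Prompt tokens and image tokens interact only inside the region.
            for p, inside in enumerate(region):
                if inside:
                    allowed[i][n_txt + p] = True
                    allowed[n_txt + p][i] = True
        start += n_tok
    return allowed
```

A real implementation would do this with tensors, and would also have to decide how a global prompt and overlapping regions are handled; the nested lists here only make the masking pattern easy to inspect.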

nolan4 avatar Oct 25 '25 02:10 nolan4

Hey @nolan4, thank you so much for this amazing work! I am trying to test this but cannot get it to run. I am using this PR with the EliGen LoRA provided by DiffSynth Studio and this workflow; please check.

krigeta avatar Oct 28 '25 06:10 krigeta

Hey @nolan4, could you please reply?

krigeta avatar Oct 29 '25 18:10 krigeta

Hi @krigeta, here’s a screenshot of my workflow, which is based on the Qwen text-to-image template. I’m also using the same EliGen LoRA from DiffSynth Studio that you linked. Hope this helps you get it running! (Screenshot: eligen example workflow)

nolan4 avatar Oct 29 '25 19:10 nolan4

Hey @nolan4, thank you so much for this. If this branch is not merged, would it be possible to turn it into a custom node?

And yes, I will test this and share the results for sure.

krigeta avatar Oct 29 '25 20:10 krigeta

Hey @nolan4, it is not working in my case; please check. Screenshot (99)

krigeta avatar Oct 30 '25 18:10 krigeta

Looks like you have multiple loras, try just one for testing purposes. I tried your prompt with a few minor modifications: dragonball z example

nolan4 avatar Oct 30 '25 23:10 nolan4

Tested locally and it works. comfy will do a code review to see if anything else needs to be changed!

Kosinkadink avatar Oct 31 '25 01:10 Kosinkadink

Looks like you have multiple loras, try just one for testing purposes. I tried your prompt with a few minor modifications: dragonball z example

I will test it as soon as possible, and this time I will use the official example shown in the official EliGen LoRA implementation. One more thing I want to ask:

In the Diffsynth studio repo, they said it is important to make those colored masks with text overlays. Is that true?

As in their official example, the masks are overlaid to achieve smooth results.

krigeta avatar Oct 31 '25 07:10 krigeta

2123 How can I get this node?

Amazon90 avatar Oct 31 '25 08:10 Amazon90

2123 How can I get this node?

To use this node you have to install this PR rather than the standard ComfyUI setup; as of now it is not part of the main repo.

krigeta avatar Nov 01 '25 11:11 krigeta

Hey @nolan4, I guess this implementation is missing the colour-coded masks that help the lora to differentiate between the regions when they overlap. Please look into it.

krigeta avatar Nov 02 '25 13:11 krigeta

btw something similar was implemented in the Inspire Pack, called "regional conditioning by color masks", in case you need inspiration or code:
https://github.com/ltdrdata/ComfyUI-extension-tutorials/blob/Main/ComfyUI-Inspire-Pack/workflow/RegionalCFG-RegionalConditioning.png
https://github.com/ltdrdata/ComfyUI-Inspire-Pack

geroldmeisinger avatar Nov 02 '25 18:11 geroldmeisinger

How can I get this node?

comfy-cli --workspace ./ComfyUI_eligen install --pr "#10473"
comfy-cli --workspace ./ComfyUI_eligen launch

geroldmeisinger avatar Nov 02 '25 18:11 geroldmeisinger

Looks like you have multiple loras, try just one for testing purposes. I tried your prompt with a few minor modifications

I can confirm it works: 8step lora -> eligen lora -> ksampler

I guess this implementation is missing the colour-coded masks that help the lora to differentiate between the regions when they overlap. Please look into it.

I can confirm it works

Screenshot from 2025-11-02 21-01-08

eligen_colorspheres.json

mask_ball1 mask_ball2 mask_ball3

Note: I use euler+beta, cfg=1, 1328x1328, but that shouldn't make much difference.

geroldmeisinger avatar Nov 02 '25 19:11 geroldmeisinger

Hey @geroldmeisinger, thank you so much for sharing. Is there any other social media where we can chat?

krigeta avatar Nov 03 '25 05:11 krigeta

here are the original masks https://www.modelscope.cn/datasets/DiffSynth-Studio/examples_in_diffsynth/files data/examples/eligen/entity_control of the eligen demo page https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-EliGen

geroldmeisinger avatar Nov 03 '25 08:11 geroldmeisinger

here are the original masks https://www.modelscope.cn/datasets/DiffSynth-Studio/examples_in_diffsynth/files data/examples/eligen/entity_control of the eligen demo page https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-EliGen

These are the ones through which I got to know about EliGen. In my case, overlapping masks are not working properly: I want to create two characters, one in front of the other, viewed from the back. Put differently, I use one entity for one character and another entity for the other character. Does this also work with ControlNet?

And if possible, could you share your Discord or another social app?

krigeta avatar Nov 03 '25 10:11 krigeta

image

geroldmeisinger avatar Nov 08 '25 17:11 geroldmeisinger

image

Haha, indeed, they are needed. By the way, are you able to get the result shown here?

image image

krigeta avatar Nov 08 '25 17:11 krigeta

@geroldmeisinger @krigeta

Latest version supports 8 entities! Below is a screenshot for the referenced DiffSynth example.

reference example

nolan4 avatar Nov 10 '25 00:11 nolan4

awesome! see https://docs.comfy.org/custom-nodes/backend/more_on_inputs#dynamically-created-inputs in case you want to make it dynamic

@krigeta https://github.com/krigeta/eligen_test/issues/1

geroldmeisinger avatar Nov 10 '25 06:11 geroldmeisinger

There will be a native growing input type soon - any custom javascript to do this will not be allowed in core outside of a 'general' implementation! That will allow an indefinite amount of inputs into the node.

Kosinkadink avatar Nov 14 '25 00:11 Kosinkadink