[Alpha-VLLM Team] Add Lumina-T2X to diffusers
What does this PR do?
Add Lumina-T2X to diffusers
Fixes https://github.com/huggingface/diffusers/pull/8652
Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline?
- [x] Did you read our philosophy doc (important for complex PRs)?
- [x] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [x] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
I reviewed `Attention` and `LuminaAttnProcessor2_0` — looking very nice! I left some questions :) Most importantly I want to understand the `kv_heads` variable we added to `Attention`. Is this based on your research or some other paper? Why do we give `k` and `v` smaller dimensions and then duplicate them for the attention calculation?
Yes, this is called Grouped-Query Attention, proposed in this paper, which can optimize training and inference efficiency.
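To make the `kv_heads` idea concrete, here is a minimal NumPy sketch of Grouped-Query Attention (this is an illustrative toy, not the actual diffusers implementation): `k` and `v` are projected to fewer heads than `q`, and each kv head is duplicated so that a group of query heads shares it.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy GQA: q has num_heads heads; k/v have only kv_heads heads.

    q: (num_heads, seq, dim), k/v: (kv_heads, seq, dim).
    Each kv head is repeated so num_heads // kv_heads query heads share it.
    """
    num_heads, seq, dim = q.shape
    kv_heads = k.shape[0]
    repeats = num_heads // kv_heads
    # duplicate each kv head so the shapes match the query heads
    k = np.repeat(k, repeats, axis=0)
    v = np.repeat(v, repeats, axis=0)
    # standard scaled dot-product attention per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dim)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

The memory saving comes from storing only `kv_heads` projections in the KV cache while keeping the full `num_heads` for queries; the duplication at attention time recovers the usual per-head computation.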
We have fixed all the problems above. Should we proceed to the next step?
@PommesPeter thanks! Can we fix the failing CI? Looking at it, I think you need to:
- add the new doc pages to https://github.com/huggingface/diffusers/blob/main/docs/source/en/_toctree.yml
- run `make style` and `make fix-copies`

We will wait for @DN6 to do a review in the meantime!
Okay, we have added all our docs and run `make style` and `make fix-copies` on the current branch.
Can you run `make fix-copies` again?
run it~
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@PommesPeter the lumina tests still fail. I think we need to update the lumina tests now because we made updates to the model.
Yep, we have fixed the problem in the test class.
@PommesPeter we need `make style` again, sorry!
Not sure what's the status of the PR, since a simple load is still failing?

```shell
pip install git+https://github.com/PommesPeter/diffusers@lumina
```

```python
import torch
from diffusers import LuminaText2ImgPipeline

pipe = LuminaText2ImgPipeline.from_pretrained("Alpha-VLLM/Lumina-Next-SFT-diffusers", torch_dtype=torch.bfloat16).cuda()
```

```
ValueError: Cannot load <class 'diffusers.models.transformers.lumina_nextdit2d.LuminaNextDiT2DModel'> from /mnt/models/Diffusers/models--Alpha-VLLM--Lumina-Next-SFT-diffusers/snapshots/f82702c1b6a9bac3db9155edad1fd8dbf088cdf6/transformer because the following keys are missing:
...
```
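An error like this generally means the serialized checkpoint and the model class disagree on parameter names, often because the checkpoint on the Hub is out of sync with the code. A generic pure-Python sketch of the diagnosis (hypothetical helper and toy key names, not diffusers internals) is to diff the two key sets:

```python
def missing_and_unexpected(expected_keys, checkpoint_keys):
    """Compare the keys a model expects against the keys in a checkpoint.

    Returns (missing, unexpected): keys the model needs but the checkpoint
    lacks, and keys the checkpoint has but the model does not know about.
    """
    expected, found = set(expected_keys), set(checkpoint_keys)
    return sorted(expected - found), sorted(found - expected)

# Toy example with made-up parameter names:
expected = ["attn.to_q.weight", "attn.to_kv.weight", "norm.weight"]
checkpoint = ["attn.to_q.weight", "attn.to_k.weight", "norm.weight"]
missing, unexpected = missing_and_unexpected(expected, checkpoint)
```

Here `missing` would flag `attn.to_kv.weight`, suggesting the checkpoint was exported from an older model definition and needs to be re-converted and re-pushed.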
Sorry, we had a network problem pushing our newest model to Hugging Face. I'm re-pushing the newest model for Lumina.
Hi @vladmandic, we have pushed our model to the Hugging Face repo. Could you re-pull the model repo for the testing you want?
@PommesPeter can you check if you need to update the slow tests? Since the checkpoints have been updated a couple of times, I will merge it tomorrow once the slow tests are updated.
Okay, I will fix the problem.
@PommesPeter confirmed as working with the updated model on HF.
merged! thank you!
Wow! Thank you for reviewing our PR!