diffusers: refactor community pipelines inheriting from `StableDiffusionPipeline`

We have several community pipelines that use StableDiffusionPipeline as the base class instead of DiffusionPipeline. Most of them broke after we changed the signature of StableDiffusionPipeline. See more details on the issue here: https://github.com/huggingface/diffusers/issues/6969#issuecomment-1944162819

Community pipelines should not use StableDiffusionPipeline as a base class. Is anyone interested in refactoring some of these community pipelines to inherit from DiffusionPipeline instead? You will need to:

Update the base class to use DiffusionPipeline and any of the mixins needed, e.g.

class CommunityPipeline(
    DiffusionPipeline, TextualInversionLoaderMixin, LoraLoaderMixin, IPAdapterMixin, FromSingleFileMixin
):

If the community pipeline uses any methods from StableDiffusionPipeline, use a `# Copied from` statement to copy those methods into your community pipeline. Learn more about the Copied from mechanism here: https://huggingface.co/docs/diffusers/conceptual/contribution#copied-from-mechanism

Here is the list of pipelines that need to be refactored!

[x] Prompt2PromptPipeline (completed by @ihkap11, #7211): https://github.com/huggingface/diffusers/blob/main/examples/community/pipeline_prompt2prompt.py
[ ] RegionalPromptingStableDiffusionPipeline (@pranjalks is working on it): https://github.com/huggingface/diffusers/blob/main/examples/community/regional_prompting_stable_diffusion.py
[x] StableDiffusionReferencePipeline (completed by @standardAI, https://github.com/huggingface/diffusers/pull/7071)
[ ] TensorRTStableDiffusionPipeline (@Bhavay-2001 is working on it): https://github.com/huggingface/diffusers/blob/main/examples/community/stable_diffusion_tensorrt_txt2img.py

Feb 15 '24 08:02 yiyixuxu

Hi @yiyixuxu, I am interested in resolving this issue. Please let me know if I can take this up.

Feb 15 '24 10:02 pranjalks

Thank you for this opportunity! I would like to take StableDiffusionReferencePipeline.

Feb 15 '24 11:02 tolgacangoz

hi @pranjalks sure! Just leave a message here about which ones you would like to work on:)

Feb 15 '24 20:02 yiyixuxu

@standardAI thanks!

Feb 15 '24 20:02 yiyixuxu

Hey, I have started working on RegionalPromptingStableDiffusionPipeline. Thank you!

Feb 15 '24 20:02 pranjalks

@yiyixuxu I'll work on Prompt2Prompt pipeline. Thank you!

Feb 15 '24 21:02 nxbringr

Hi @yiyixuxu, I would be happy to work on TensorRTStableDiffusionPipeline. Can you please assign it to me? Thanks

Feb 16 '24 17:02 Bhavay-2001

@Bhavay-2001 sure!

Feb 16 '24 17:02 yiyixuxu

Hi @pranjalks, is there any news on this? I have tried the pipeline and it is not working, so I am wondering whether the refactor is finished or whether I should also take a look and try to refactor it :)

May 13 '24 15:05 katarzynasornat

@katarzynasornat feel free to open a PR:)

May 13 '24 19:05 yiyixuxu

@yiyixuxu @katarzynasornat

I took a crack at upgrading the Regional Prompter but even with using the mix-ins I'm still getting this error:

ValueError: Pipeline <class 'regional_prompting_stable_diffusion.RegionalPromptingStableDiffusionPipeline'> expected {'safety_checker', 'vae', 'unet', 'scheduler', 'feature_extractor', 'tokenizer', 'text_encoder'}, but only {'vae', 'unet', 'scheduler', 'tokenizer', 'text_encoder'} were passed.

Revised instantiation cribbed from #7071; I'm likely importing too much but nothing less worked either.

class RegionalPromptingStableDiffusionPipeline(DiffusionPipeline, TextualInversionLoaderMixin, LoraLoaderMixin, IPAdapterMixin, FromSingleFileMixin):

Python isn't my main language though so perhaps I'm doing something dumb or underestimating the amount of refactoring needed.

May 16 '24 22:05 jelling
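
For readers hitting the same ValueError: it is consistent with the pipeline reading its expected components from the `__init__` signature and tolerating missing ones only when they are declared optional. As far as I can tell, StableDiffusionPipeline lists `safety_checker` and `feature_extractor` in `_optional_components`, so a class that switches its base to DiffusionPipeline without redeclaring that attribute would start requiring them. The sketch below is a simplified, stdlib-only illustration of that idea, not the actual diffusers code; `ToyPipelineBase`, `check_components`, `StrictPipeline`, and `LenientPipeline` are hypothetical names.

```python
import inspect


class ToyPipelineBase:
    """Hypothetical stand-in for a pipeline base class; not the diffusers API."""

    # Names listed here may be absent from the components that were passed.
    _optional_components = []

    @classmethod
    def check_components(cls, passed):
        # Expected names are read from the __init__ signature, minus `self`.
        params = inspect.signature(cls.__init__).parameters
        expected = set(params) - {"self"}
        missing = expected - set(passed)
        # A missing name is tolerated only if the class declares it optional.
        if missing - set(cls._optional_components):
            raise ValueError(
                f"Pipeline {cls.__name__} expected {sorted(expected)}, "
                f"but only {sorted(passed)} were passed."
            )
        # Optional components that were not passed are filled with None.
        return {name: passed.get(name) for name in expected}


class StrictPipeline(ToyPipelineBase):
    # No optional components declared: all seven arguments are required.
    def __init__(self, vae, text_encoder, tokenizer, unet, scheduler,
                 safety_checker, feature_extractor):
        pass


class LenientPipeline(StrictPipeline):
    # Same signature, but two components are declared optional.
    _optional_components = ["safety_checker", "feature_extractor"]


passed = {name: object() for name in
          ("vae", "text_encoder", "tokenizer", "unet", "scheduler")}

try:
    StrictPipeline.check_components(passed)
    strict_error = None
except ValueError as err:
    strict_error = str(err)

components = LenientPipeline.check_components(passed)
```

If this reading is right, the refactored pipeline would need its own `_optional_components` declaration, which seems to be what the "2 parameters" hint later in the thread points at.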

@jelling @yiyixuxu I will try to help refactor it. Before doing that (I am quite new to diffusers and not a pro in deep learning, but I am a very fast learner :)), I wanted to downgrade to a version of diffusers from before the change. Unfortunately, I hit another error that made it impossible to run the pipeline and see if/how it works.

I have enclosed my gist below. Would you be so kind as to take a look at what could be going wrong? Otherwise it is hard to improve a pipeline that may not be working anyway.

May 17 '24 11:05 katarzynasornat

@jelling Tip: 2 parameters were expected additionally, right? You can see something about these 2 parameters in my PR.

May 17 '24 13:05 tolgacangoz

@standardAI Are you able to look at my gist after downgrading diffusers? Because after refactoring I am getting the same error and I am not sure if refactoring went wrong or the pipeline is somehow not working itself.

Thank you!

EDIT: the error looks something like this:

RuntimeError                              Traceback (most recent call last)
in <cell line: 1>()
----> 1 images = pipe(
      2     prompt=prompt,
      3     negative_prompt="""ugly""",
      4     guidance_scale=7.5,
      5     height = 512,

18 frames
in forward(hidden_states, encoder_hidden_states, attention_mask, temb, scale)
    512         head_dim = inner_dim // attn.heads
    513
--> 514         query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
    515
    516         key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)

RuntimeError: shape '[5, -1, 8, 40]' is invalid for input of size 5242880

May 18 '24 11:05 katarzynasornat
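
The numbers in that RuntimeError can be checked by hand. The arithmetic below is a small sketch using only values from the traceback, plus two assumptions that are not stated in the thread: that the image is 512x512 (only `height = 512` is visible) and that latents are downsampled 8x, as is typical for Stable Diffusion 1.x. Under those assumptions the query tensor appears to hold a batch of 4 while `batch_size` is 5, which would explain why the view fails.

```python
size = 5_242_880                        # element count reported in the error
batch_size, heads, head_dim = 5, 8, 40  # requested view shape [5, -1, 8, 40]

inner_dim = heads * head_dim            # features per token: 8 * 40 = 320
assert size % inner_dim == 0            # the tensor splits cleanly into tokens
rows = size // inner_dim                # batch * sequence_length actually present

# view() needs rows to be divisible by the requested batch size.
fits_requested_batch = rows % batch_size == 0

# Assumption: 512x512 image with 8x latent downsampling -> 64 * 64 tokens.
seq_len = (512 // 8) ** 2
actual_batch = rows // seq_len

print(rows, fits_requested_batch, seq_len, actual_batch)
```

A mismatch like this suggests that `batch_size` is being taken from a tensor with a different leading dimension than the query (for example the prompt embeddings), though the traceback alone does not confirm that.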

Debugging for a previous version of diffusers might be unproductive, IMHO. If you open a refactoring PR I can guide you. You can take a look at the previously merged refactorings within this issue.

May 26 '24 08:05 tolgacangoz

@tolgacangoz so cool! I will open a PR today. I assume I should work on the current diffusers version, 0.28.0.dev0? I took a look at the previous examples, and what I found is that all of them (I think) were using StableDiffusionPipelineOutput, whereas the regional prompting pipeline invokes StableDiffusionPipeline internally. Happy to work on it and finish it properly with your guidance!

May 28 '24 09:05 katarzynasornat