[Feature Request][Community] Ability to pass text_embeddings/uncond_embeddings as arguments in pipe call

Open hadaev8 opened this issue 3 years ago • 12 comments

Is your feature request related to a problem? Please describe. I'm experimenting with aesthetic gradients and need to overwrite the pipe call to pass text_embeddings/uncond_embeddings. It might also save a bit of time when making a lot of images with the same prompt.

Describe the solution you'd like Ability to pass text_embeddings/uncond_embeddings to pipe call.

Describe alternatives you've considered Idk. Maybe split everything to separate functions and make it less cluttered.

hadaev8 avatar Oct 23 '22 18:10 hadaev8

Hey @hadaev8,

Do you think we could use a community pipeline for this? If there is a big use case, it'd also be ok for me to add it to the native stable diffusion pipelines (wdyt @patil-suraj @anton-l @pcuenca ?)

patrickvonplaten avatar Oct 26 '22 10:10 patrickvonplaten

Also see: https://github.com/huggingface/diffusers/pull/958 -> in general this request is ok for me - curious to hear other opinions though!

patrickvonplaten avatar Oct 26 '22 12:10 patrickvonplaten

Since aesthetic gradients modify the text encoder output, it works with every pipe. I haven't tested it with the new inpainting pipe, but I don't see why it wouldn't work. So I think a separate pipe is not a good way to do this.

Also, if it is acceptable in this repo, I would like to contribute an example notebook or something similar showing how it works with all the default pipes.

hadaev8 avatar Oct 26 '22 13:10 hadaev8

@hadaev8 I'd be interested to see that, do you have a colab available?

dblunk88 avatar Oct 26 '22 13:10 dblunk88

Hey @hadaev8! Could you point us to an example of aesthetic gradients? I'm hearing about it for the first time :)

If it's a really big use case I would also be in favor of it. For now I see two things which could benefit from this:

  • imagic #958 -> I'm not sure if we can modify the pipeline for this, as the trained checkpoints are not really re-usable and are specific to the prompt and image being edited.
  • and stable diffusion videos where we interpolate text embeddings -> but this requires lots of additional stuff and already has a repo and custom pipeline

so unless we have a really big use case, I would like to keep the pipelines simple :)

patil-suraj avatar Oct 26 '22 13:10 patil-suraj
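
To illustrate the second bullet above, here is a rough sketch of interpolating between two text embeddings to get one embedding per video frame. It assumes plain linear interpolation on flat lists of floats; the actual stable diffusion videos project may use a different scheme, and `lerp_embeddings` and `interpolation_frames` are hypothetical names, not functions from any library.

```python
from typing import List

Embedding = List[float]


def lerp_embeddings(start: Embedding, end: Embedding, t: float) -> Embedding:
    """Linearly interpolate between two embeddings; t=0 gives start, t=1 gives end."""
    if len(start) != len(end):
        raise ValueError("Embeddings must have the same length.")
    return [(1.0 - t) * a + t * b for a, b in zip(start, end)]


def interpolation_frames(start: Embedding, end: Embedding, num_frames: int) -> List[Embedding]:
    """Return num_frames embeddings evenly spaced from start to end (inclusive)."""
    if num_frames < 2:
        raise ValueError("Need at least two frames to interpolate.")
    return [
        lerp_embeddings(start, end, i / (num_frames - 1))
        for i in range(num_frames)
    ]


# Toy 2-dimensional "embeddings" standing in for real text encoder outputs.
frames = interpolation_frames([0.0, 2.0], [4.0, 0.0], num_frames=3)
```

Each element of `frames` would then have to be fed to the pipe as an embedding, which is only possible if the pipe call accepts embeddings instead of a prompt string.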

I've seen the results of it, definitely worth taking a look. The end results are amazing

dblunk88 avatar Oct 26 '22 13:10 dblunk88

@patil-suraj This repo https://github.com/vicgalle/stable-diffusion-aesthetic-gradients

Basically, it changes the weights of the text encoder to match CLIP image representations.

Almost all of it is possible to do outside of the pipe, but such tuning makes catastrophic forgetting kick in, so I think (and the author does it too) it's better to pass unchanged uncond embeddings from the original text model.

In my notebooks I just copy-pasted the whole pipe function for a very minor change. Of course that's my problem, but flexibility is always good.

hadaev8 avatar Oct 26 '22 14:10 hadaev8
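
A minimal sketch of the approach described above, assuming toy stand-ins rather than real models: `original_encoder` and `tuned_encoder` are hypothetical callables representing the text encoder before and after aesthetic-gradient tuning, and `build_embeddings` is an illustrative helper, not part of diffusers. It shows the conditional embeddings coming from the tuned encoder while the unconditional embeddings still come from the untouched one.

```python
from typing import Callable, List, Tuple

Embedding = List[float]
Encoder = Callable[[str], Embedding]


def build_embeddings(
    prompt: str,
    tuned_encoder: Encoder,
    original_encoder: Encoder,
    negative_prompt: str = "",
) -> Tuple[Embedding, Embedding]:
    """Return (text_embeddings, uncond_embeddings) from two different encoders."""
    # Conditional branch: use the tuned encoder so the aesthetic shift applies.
    text_embeddings = tuned_encoder(prompt)
    # Unconditional branch: use the original encoder to sidestep forgetting.
    uncond_embeddings = original_encoder(negative_prompt)
    return text_embeddings, uncond_embeddings


# Toy encoders (assumptions for illustration only, not real CLIP models).
def original_encoder(text: str) -> Embedding:
    return [float(len(text)), 0.0]


def tuned_encoder(text: str) -> Embedding:
    base = original_encoder(text)
    return [base[0], base[1] + 1.0]  # pretend tuning shifted one dimension


text_emb, uncond_emb = build_embeddings("a castle", tuned_encoder, original_encoder)
```

With a pipe call that accepts embeddings directly, the two returned values could be passed in as-is instead of copying the whole pipe function.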

@dblunk88 https://colab.research.google.com/drive/1RXolb8ozC4qSCZfnfO-PdVSC25Aj1dTZ?usp=sharing Have fun

hadaev8 avatar Oct 26 '22 14:10 hadaev8

Awesome, thanks @hadaev8 !

patil-suraj avatar Oct 26 '22 16:10 patil-suraj

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Nov 23 '22 15:11 github-actions[bot]

This would still be nice to have.

hadaev8 avatar Nov 25 '22 02:11 hadaev8

Actually fine for me to add this! Would someone be interested in opening a PR for this? I won't find time anytime soon, but I'll keep this issue on my radar in case more people ask for it.

patrickvonplaten avatar Nov 30 '22 11:11 patrickvonplaten

@patrickvonplaten I want to do it. What do you think: should I reuse the prompt and negative prompt variables and add checks for whether they are already tensors, like the image arg already does?

hadaev8 avatar Dec 22 '22 15:12 hadaev8

Hey @hadaev8,

I think we should just add new variables just like we've done for UnCLIP here: https://github.com/huggingface/diffusers/pull/1858

Happy to help with a PR :-)

patrickvonplaten avatar Jan 03 '23 11:01 patrickvonplaten
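
The two options discussed here can be sketched roughly as below. This is only an illustration under assumptions: `check_inputs` is a hypothetical helper, the argument name `text_embeddings` is taken from the issue title rather than from any merged API, and embeddings are plain lists instead of tensors. It follows the separate-argument approach: `prompt` stays a string, embeddings get their own argument, and exactly one of the two must be given.

```python
from typing import Optional, Sequence


def check_inputs(
    prompt: Optional[str] = None,
    text_embeddings: Optional[Sequence[float]] = None,
) -> str:
    """Validate the inputs and return the name of the argument that will be used."""
    if prompt is not None and text_embeddings is not None:
        raise ValueError("Pass either `prompt` or `text_embeddings`, not both.")
    if prompt is None and text_embeddings is None:
        raise ValueError("One of `prompt` or `text_embeddings` is required.")
    if prompt is not None:
        # The caller would still need to run the text encoder on the string.
        return "prompt"
    # Precomputed embeddings: the text encoder step can be skipped.
    return "text_embeddings"


source_a = check_inputs(prompt="a cat")
source_b = check_inputs(text_embeddings=[0.1, 0.2])
```

Compared with overloading `prompt` to accept either a string or a tensor, separate arguments keep each parameter to a single type, at the cost of one extra validation step.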

Mostly solved by https://github.com/huggingface/diffusers/pull/2071

patrickvonplaten avatar Mar 18 '23 18:03 patrickvonplaten