diffusers icon indicating copy to clipboard operation
diffusers copied to clipboard

[Core] refactor embeddings.py so that it becomes a nice module.

Open sayakpaul opened this issue 1 year ago • 1 comments

What does this PR do?

The embeddings.py script is becoming long and bloated.

This PR breaks the script meaningfully so that it can be supplemented as a module.

  1. Creates a embeddings_utils.py to have common methods shared across the board.
  2. Moves the IP Adapter embeddings blocks to a dedicated embeddings/ip_adapter.pt.
  3. Does the same for PixArt-related embeddings, too.

This should be very much backwards-preserving.

Once I have an initial approval, I will make sure all the embedding classes are documented, too.

sayakpaul avatar May 21 '24 10:05 sayakpaul

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu and @DN6 did you have a chance to think through this? :)

sayakpaul avatar Jun 24 '24 12:06 sayakpaul

I'm cool to divide them up a little bit by "type":

  • positional embeddings is a pretty big category on its own,
  • timesteps
  • text/image: we only have one text projection embedding so ideally just combine them; they are all simply projection layers
  • other: gligen bounding box, label embedding
  • combined

yiyixuxu avatar Jun 24 '24 20:06 yiyixuxu

Yeah that sounds reasonable to me. I will let @DN6 also comment here and I will take a stab :)

sayakpaul avatar Jun 25 '24 01:06 sayakpaul

Agree to breaking up by type as well. @yiyixuxu's suggestions make sense to me.

DN6 avatar Jun 27 '24 07:06 DN6

Closing because of brutal merge conflicts. Let's jam on https://github.com/huggingface/diffusers/pull/8722, instead.

sayakpaul avatar Jun 27 '24 11:06 sayakpaul