
Fine-Grained Subject-Specific Attribute Control

Open UmerHA opened this issue 1 year ago • 8 comments

Model/Pipeline/Scheduler description

This paper shows how to control the attributes of individual subjects in a text-to-image model. For example, given the prompt "A cat and dog", you can change attributes of just the dog (make it larger, smaller, darker, lighter, etc.) while keeping the rest of the image unchanged.
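
To make the idea concrete, here is a rough sketch, assuming (as I understand the paper) that an attribute is represented as a direction in the text-encoder embedding space and is added only to the tokens of the targeted subject. The function name, the toy embeddings, and the `delta_size` direction are all made up for illustration; real directions would come from the official repo, and real token positions depend on the tokenizer.

```python
def shift_subject_tokens(prompt_embeds, subject_token_idx, delta, scale):
    """Return a copy of prompt_embeds with scale * delta added to the subject's tokens.

    prompt_embeds:     per-token embedding vectors (list of lists of floats)
    subject_token_idx: positions of the subject's tokens in the prompt
    delta:             attribute direction (hypothetical), same length as one embedding
    scale:             signed edit strength; 0.0 leaves the prompt unchanged
    """
    edited = [list(row) for row in prompt_embeds]  # copy so the input stays untouched
    for i in subject_token_idx:
        edited[i] = [x + scale * d for x, d in zip(edited[i], delta)]
    return edited


# Toy prompt with 4 tokens ("a", "cat", "and", "dog") and embedding dimension 3.
embeds = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
delta_size = [0.5, -0.5, 0.0]  # made-up "size" direction, not a learned one

# Only the "dog" token (index 3) is shifted; all other tokens are unchanged.
edited = shift_subject_tokens(embeds, subject_token_idx=[3], delta=delta_size, scale=2.0)
```

Varying `scale` continuously (including negative values) is what would give the smooth "smaller to larger" behaviour, while leaving the other tokens alone is what keeps the edit subject-specific.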

In this example from the paper, the age attribute of "man" is varied: (screenshot: CleanShot 2024-03-27 at 18 31 17@2x)

I think this is a fantastic addition to the SD/SDXL image-to-image pipelines.

I'm happy to implement this, if there's interest from the community.

Open source status

  • [X] The model implementation is available.
  • [ ] The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Official repo: https://github.com/CompVis/attribute-control by @stefan-baumann @kliyer-ai

UmerHA avatar Mar 27 '24 17:03 UmerHA

That would be awesome if you could do an integration! If you need any help or guidance, let me know, and I'll happily help!

stefan-baumann avatar Mar 27 '24 17:03 stefan-baumann

Hi @sayakpaul, @UmerHA, I would like to take this up. I am new to contributing to diffusers and would love some pointers to get started.

RamitPahwa avatar Mar 29 '24 16:03 RamitPahwa

@RamitPahwa I can guide you if you want. First steps (I'll be very basic as I don't know your level):

  • read the contribution guidelines & design philosophy
  • fork the repo & create a new branch
  • understand the paper methodology
  • define the scope (the paper covers inference, finding new attribute controls without training, and finding new attribute controls with training; I'd say start with inference only)
  • play with the paper's code base to understand it
  • design, code, test & document your solution (I think inference can be done with callbacks)
  • publish your PR!
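
To illustrate the callback idea, here is a minimal sketch, not a finished design. It relies on the `callback_on_step_end` hook that diffusers pipelines expose, where the callback receives `(pipe, step_index, timestep, callback_kwargs)` and returns the (possibly modified) `callback_kwargs`. The helper name `make_delayed_edit_callback` and the choice to keep the unedited prompt for the first few steps are my own assumptions, and the strings below stand in for real embedding tensors.

```python
def make_delayed_edit_callback(edited_prompt_embeds, start_step):
    """Build a callback that swaps in edited prompt embeddings from start_step onward.

    The callback runs at the END of each denoising step, so a change made at
    the end of step i is first used in step i + 1.
    """

    def callback(pipe, step_index, timestep, callback_kwargs):
        if step_index >= start_step - 1:
            callback_kwargs["prompt_embeds"] = edited_prompt_embeds
        return callback_kwargs

    return callback


# Stand-alone check with placeholder values instead of tensors.
cb = make_delayed_edit_callback(edited_prompt_embeds="edited", start_step=2)
after_step_0 = cb(None, 0, 999, {"prompt_embeds": "base"})
after_step_1 = cb(None, 1, 950, {"prompt_embeds": "base"})

# Intended use with a real pipeline (untested here; with classifier-free
# guidance the in-loop prompt_embeds holds negative and positive embeddings
# concatenated, so the edited tensor would need the same shape):
# image = pipe(
#     prompt_embeds=base_embeds,
#     callback_on_step_end=cb,
#     callback_on_step_end_tensor_inputs=["prompt_embeds"],
# ).images[0]
```

If the edit should be active from the very first step, no callback is needed at all: the edited embeddings can simply be passed as `prompt_embeds`.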

Happy to answer any questions you have!

UmerHA avatar Mar 29 '24 18:03 UmerHA

While I don't know that much about diffusers' design philosophy (otherwise, I would have created a PR myself already), I'm happy to answer any questions about the method and the reference implementation.

A good reference is the real image editing example notebook I added. It shows a minimal example of what needs to be done on top of standard diffusers pipelines, as it relies primarily on diffusers for inference: https://github.com/CompVis/attribute-control/tree/main/notebooks/real_image_editing

stefan-baumann avatar Mar 29 '24 21:03 stefan-baumann

@RamitPahwa hey, any updates? :)

UmerHA avatar Apr 03 '24 18:04 UmerHA

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 28 '24 15:04 github-actions[bot]

Hey @UmerHA, I would like to work on this. I have already gone through the paper. Is there any progress on this?

akiseakusa avatar Jun 23 '24 15:06 akiseakusa

Hey @akiseakusa, no progress yet. Feel free to take it up!

UmerHA avatar Jun 23 '24 15:06 UmerHA

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 14 '24 15:09 github-actions[bot]