diffusers Fine-Grained Subject-Specific Attribute Control

Model/Pipeline/Scheduler description

This paper shows that you can control the attributes of individual subjects in a text-to-image model. For the prompt "A cat and dog", for example, you can change the attributes of just the dog: make it larger, smaller, darker, lighter, etc., while keeping the rest of the image unchanged.

In this example from the paper, the age attribute of "man" is varied: (image: screenshot from the paper)

I think this would be a fantastic addition to the SD/SDXL image-to-image pipelines.

I'm happy to implement this, if there's interest from the community.

Open source status

[X] The model implementation is available.
[ ] The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Official repo: https://github.com/CompVis/attribute-control by @stefan-baumann @kliyer-ai

Mar 27 '24 17:03 UmerHA

That would be awesome if you could do an integration! If you need any help or guidance, let me know, and I'll happily help!

Mar 27 '24 17:03 stefan-baumann

Hi @sayakpaul, @UmerHA, I would like to take this up. I am new to contributing to diffusers and would love some pointers to get started.

Mar 29 '24 16:03 RamitPahwa

@RamitPahwa I can guide you if you want. First steps (I'll be very basic as I don't know your level):

1. Read the contribution guidelines & design philosophy
2. Fork the repo & create a new branch
3. Understand the paper methodology
4. Define scope (the paper has inference, finding new attribute controls without training, and finding new attribute controls with training; I'd say start with only inference first)
5. Play with the paper's code base to understand it
6. Design, code, test & document your solution (I think inference can be done with callbacks)
7. Publish your PR!
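
To make the callback idea in the steps above a bit more concrete, here is a rough sketch of what it might look like. It is not the paper's implementation: `make_attribute_callback`, `delta`, `token_indices`, `scale` and `start_step` are all hypothetical names, and I'm assuming that with classifier-free guidance the `prompt_embeds` batch is laid out as [negative, positive] for a single prompt. The callback signature follows diffusers' `callback_on_step_end`. NumPy arrays stand in for torch tensors so the snippet runs without model weights.

```python
import numpy as np


def make_attribute_callback(delta, token_indices, scale, start_step):
    """Build a step-end callback that shifts the subject's token embeddings once.

    delta:         attribute direction, shape (embed_dim,) (hypothetical input,
                   e.g. loaded from a learned attribute control)
    token_indices: positions of the subject's tokens in the tokenized prompt
    scale:         signed strength of the edit
    start_step:    step at whose end the edit is applied, so that the early
                   steps still run on the unmodified prompt
    """

    def callback(pipe, step, timestep, callback_kwargs):
        # Apply exactly once; the edited embeddings are assumed to be
        # carried over to all later steps by the pipeline.
        if step == start_step:
            embeds = callback_kwargs["prompt_embeds"]
            # Assumption: the last batch entry is the positive prompt.
            for idx in token_indices:
                embeds[-1, idx] = embeds[-1, idx] + scale * delta
            callback_kwargs["prompt_embeds"] = embeds
        return callback_kwargs

    return callback


# Smoke check: a fake (2, 77, 4) embedding batch instead of real CLIP outputs.
embeds = np.zeros((2, 77, 4))
delta = np.ones(4)
cb = make_attribute_callback(delta, token_indices=[5], scale=0.5, start_step=3)
for step in range(6):
    embeds = cb(None, step, None, {"prompt_embeds": embeds})["prompt_embeds"]
```

With a real pipeline, the callback would presumably be passed as `callback_on_step_end=cb` together with `callback_on_step_end_tensor_inputs=["prompt_embeds"]`, and `delta` would need to be a torch tensor on the same device and dtype as the embeddings. SDXL has additional pooled embeddings that this sketch does not touch.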

Happy to answer any questions you have!

Mar 29 '24 18:03 UmerHA

While I don't know that much about the diffusers design philosophy (otherwise, I would have created a PR myself already), I'm happy to answer any questions about the method and the reference implementation.

One good reference that shows a minimal example of what needs to be done on top of standard diffusers pipelines should be the real image editing example notebook I added, as it primarily relies on diffusers for inference: https://github.com/CompVis/attribute-control/tree/main/notebooks/real_image_editing

Mar 29 '24 21:03 stefan-baumann

@RamitPahwa hey, any updates? :)

Apr 03 '24 18:04 UmerHA

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Apr 28 '24 15:04 github-actions[bot]

Hey @UmerHA, I would like to work on this. I have already gone through the paper. Is there any progress on this?

Jun 23 '24 15:06 akiseakusa

Hey @akiseakusa, no progress - feel free to take it up!

Jun 23 '24 15:06 UmerHA

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Sep 14 '24 15:09 github-actions[bot]