Fine-Grained Subject-Specific Attribute Control
Model/Pipeline/Scheduler description
This paper shows that you can control the attributes of individual subjects in a text-to-image model. For example, for the prompt "A cat and dog", you can change the attributes of only the dog: make it larger, smaller, darker, lighter, etc., while keeping the rest of the image unchanged.
In this example from the paper, the age attribute of "man" is varied:
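A minimal sketch of the idea, assuming the edit amounts to adding a scaled attribute direction to the text embedding of the subject's token only (the function name, the toy 2-dimensional embeddings, and the "size" direction are hypothetical and not taken from the official repo):

```python
def apply_attribute_delta(token_embeddings, subject_index, delta, scale):
    """Return a copy of the per-token embeddings in which only the
    subject token is shifted by scale * delta."""
    edited = [list(vec) for vec in token_embeddings]
    edited[subject_index] = [
        e + scale * d for e, d in zip(edited[subject_index], delta)
    ]
    return edited


# Toy 3-token prompt ("a", "cat", "dog") with 2-dimensional embeddings.
embeddings = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
delta_size = [0.25, -0.25]  # hypothetical "size" direction

# Shift only the "dog" token (index 2); the other tokens stay untouched.
edited = apply_attribute_delta(embeddings, subject_index=2, delta=delta_size, scale=2.0)
```

Varying `scale` continuously (including negative values) would then correspond to making the attribute stronger or weaker for that one subject.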
I think this is a fantastic addition to the SD/SDXL image-to-image pipelines.
I'm happy to implement this, if there's interest from the community.
Open source status
- [X] The model implementation is available.
- [ ] The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
Official repo: https://github.com/CompVis/attribute-control by @stefan-baumann @kliyer-ai
That would be awesome if you could do an integration! If you need any help or guidance, let me know, and I'll happily help!
Hi @sayakpaul, @UmerHA, I would like to take this up. I am new to contributing to diffusers and would love some pointers to get started.
@RamitPahwa I can guide you if you want. First steps (I'll be very basic as I don't know your level):
- read the contribution guidelines & design philosophy
- fork the repo & create a new branch
- understand the paper methodology
- define scope (the paper has inference, finding new attribute controls without training, and finding new attribute controls with training. I'd say start with only inference first)
- play with the paper's codebase to understand it
- design, code, test & document your solution (I think inference can be done with callbacks)
- publish your PR!
Happy to answer any questions you have!
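As a rough illustration of the callback idea from the list above: a minimal sketch, assuming the edited prompt embeddings are precomputed and should only take effect from a given denoising step onward. `make_delayed_edit_callback`, `edited_prompt_embeds`, and `start_step` are hypothetical names; the callback signature follows the `callback_on_step_end` convention of diffusers pipelines.

```python
def make_delayed_edit_callback(edited_prompt_embeds, start_step):
    """Build a step-end callback that swaps in the edited prompt
    embeddings so that they are used from `start_step` onward."""

    def callback(pipe, step_index, timestep, callback_kwargs):
        # The callback runs after step `step_index` has finished, so a
        # swap made here affects the following steps.
        if step_index + 1 >= start_step:
            callback_kwargs["prompt_embeds"] = edited_prompt_embeds
        return callback_kwargs

    return callback


# Hypothetical usage with a diffusers pipeline (not executed here):
# image = pipe(
#     prompt_embeds=original_prompt_embeds,
#     callback_on_step_end=make_delayed_edit_callback(edited_prompt_embeds, 10),
#     callback_on_step_end_tensor_inputs=["prompt_embeds"],
# ).images[0]
```

One caveat: with classifier-free guidance, the pipeline's internal `prompt_embeds` holds the negative and positive embeddings concatenated, so the edited tensor would need the same layout. SDXL pipelines also carry pooled embeddings, which this sketch ignores.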
While I don't know that much about diffusers' design philosophy (otherwise, I would have created a PR myself already), I'm happy to answer any questions about the method and the reference implementation.
One good reference that shows a minimal example of what needs to be done on top of standard diffusers pipelines should be the real image editing example notebook I added, as it primarily relies on diffusers for inference: https://github.com/CompVis/attribute-control/tree/main/notebooks/real_image_editing
@RamitPahwa hey, any updates? :)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hey @UmerHA, I would like to work on this. I have already gone through the paper. Is there any progress on this?
Hey @akiseakusa, no progress - feel free to take it up!