[Pipelines] On adding Pix2Pix Zero
Pix2Pix Zero presents a wonderful idea for editing the objects in a generated image with finer controls. Some really amazing results:


The code for Pix2Pix Zero is available here: https://github.com/pix2pixzero/pix2pix-zero.
IIUC, to add this pipeline to diffusers we need the following primary things:
- Changes to cross attention so that the attention weights are book-kept.
- The DDIM inversion scheduler (especially if we wanted to condition the pipeline on a real input image).
- Potentially, an inversion pipeline that gives us the inverted noise from the starting image.
- A pipeline that takes in the inverted latent (or some randomly sampled noise if we're not using a real input image) and gives us the edited image back by actually applying the methodology proposed in Pix2Pix Zero.
I guess it's best to add the following things separately:
- DDIM inversion scheduler
- DDIM inversion pipeline
- Editing pipeline (including changes to attention)
WDYT? @patrickvonplaten @patil-suraj
- I don't think we have to change the cross attention. Instead, we can just create a pipeline-specific attention processor class inside the pipeline that has an attribute self.probs that can be set.
- Yes, sounds good to me to create a DDIM inversion scheduler.
- I would put everything that's needed for editing in one pipeline, no need to create multiple pipelines.
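To make the attention processor idea above concrete, here is a minimal toy sketch in pure Python. It is not the diffusers attention processor API: the class name ProbStoringAttention and its call signature are hypothetical, and a real processor would work on batched tensors inside the UNet. It only shows the pattern of computing the attention weights and keeping them on self.probs so the pipeline can read them later.

```python
import math


class ProbStoringAttention:
    """Toy processor (hypothetical name) that keeps the last attention weights."""

    def __init__(self):
        # Filled on every call; the pipeline could read or overwrite this.
        self.probs = None

    def __call__(self, query, key, value):
        # query: list of query vectors, key/value: lists of key/value vectors.
        scale = 1.0 / math.sqrt(len(query[0]))
        probs = []
        for q in query:
            # Scaled dot-product scores of this query against every key.
            scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in key]
            # Numerically stable softmax over the keys.
            max_score = max(scores)
            exps = [math.exp(s - max_score) for s in scores]
            total = sum(exps)
            probs.append([e / total for e in exps])

        # Book-keep the attention weights instead of discarding them.
        self.probs = probs

        # Weighted sum of the value vectors for each query.
        dim_v = len(value[0])
        return [
            [sum(p * v[j] for p, v in zip(row, value)) for j in range(dim_v)]
            for row in probs
        ]


processor = ProbStoringAttention()
output = processor(
    query=[[1.0, 0.0], [0.0, 1.0]],
    key=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    value=[[1.0], [2.0], [3.0]],
)
```

After the call, processor.probs holds one row of weights per query, which is the quantity the editing step would compare against a reference map.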
This pipeline is probably quite similar in its general idea to:
- https://github.com/huggingface/diffusers/pull/2275
Thanks a lot for the issue!
- Same comment as Patrick's for the attention weights
- Is the scheduler significantly different from the DDIMScheduler? If there is an example available, could you point to it?
- The inversion and editing pipelines can go under the same directory.
Thanks for your inputs @patrickvonplaten @patil-suraj!
Is the scheduler significantly different from the DDIMScheduler? If there is an example available, could you point to it?
This is where the inversion scheduler is implemented.
Okay, if I understand it correctly, the only difference is the step, which is the inverse of the actual step, so it's not really a new scheduler.
I wonder if we can just add an inversion_step or inverse_step method to the existing scheduler. Up for discussion.
wdyt @patrickvonplaten @pcuenca @williamberman @yiyixuxu
I would prefer to add inverse_step(). This is much more complete IMO.
+1, if it's just the inverse of the actual step, I would add it to the existing scheduler.
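For reference, a minimal sketch of what "the inverse of the actual step" means for deterministic DDIM (eta = 0), written in pure Python on lists of floats. The function names ddim_step and ddim_inverse_step, the helper _predict_x0, and the alpha_bar arguments (cumulative alpha products) are assumptions for illustration, not the diffusers scheduler API. Both directions use the same update and differ only in which noise level they move to.

```python
import math


def _predict_x0(sample, eps, alpha_bar):
    # Estimate the clean sample from the noisy sample and the predicted noise.
    return [
        (x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
        for x, e in zip(sample, eps)
    ]


def ddim_step(sample, eps, alpha_bar_t, alpha_bar_prev):
    """Deterministic DDIM step from timestep t to the less noisy t-1."""
    x0 = _predict_x0(sample, eps, alpha_bar_t)
    return [
        math.sqrt(alpha_bar_prev) * x + math.sqrt(1.0 - alpha_bar_prev) * e
        for x, e in zip(x0, eps)
    ]


def ddim_inverse_step(sample, eps, alpha_bar_t, alpha_bar_next):
    """Same update, but moving from timestep t to the noisier t+1."""
    x0 = _predict_x0(sample, eps, alpha_bar_t)
    return [
        math.sqrt(alpha_bar_next) * x + math.sqrt(1.0 - alpha_bar_next) * e
        for x, e in zip(x0, eps)
    ]


# Round trip with a fixed noise prediction (toy values).
x_t = [0.5, -1.2, 0.3]
eps = [0.1, 0.4, -0.7]
x_prev = ddim_step(x_t, eps, alpha_bar_t=0.5, alpha_bar_prev=0.8)
x_back = ddim_inverse_step(x_prev, eps, alpha_bar_t=0.8, alpha_bar_next=0.5)
```

With a fixed eps the round trip recovers x_t up to floating-point error. In a real pipeline the noise is predicted by the model from a different sample at each step, so the inversion is only approximate.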
We have Pix2Pix Zero in diffusers already.
https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero