diffusers [Pipelines] On adding Pix2Pix Zero

Pix2Pix Zero presents a wonderful idea for editing the objects in a generated image with finer control, and the results are really impressive.

The code for Pix2Pix Zero is available here: https://github.com/pix2pixzero/pix2pix-zero.

IIUC, to add this pipeline to diffusers we need the following primary things:

Changes to cross attention so that the attention weights are book-kept.
The DDIM inversion scheduler (especially if we wanted to condition the pipeline on a real input image).
Potentially, an inversion pipeline that gives us the inverted noise from the starting image.
A pipeline that takes in the inverted latent (or some randomly sampled noise if we're not using a real input image) and gives us the edited image back by actually applying the methodology proposed in Pix2Pix Zero.

I guess it's best to add the following things separately:

DDIM inversion scheduler
DDIM inversion pipeline
Editing pipeline (including changes to attention)

WDYT? @patrickvonplaten @patil-suraj

Feb 13 '23 06:02 sayakpaul

I don't think we have to change the cross attention. Instead, we can just create a pipeline-specific attention processor class inside the pipeline that has an attribute self.probs that can be set.
Yes, sounds good to me to create a DDIM inversion scheduler.
I would put everything that's needed for editing in one pipeline; no need to create multiple pipelines.
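To make the attention processor suggestion concrete, here is a minimal toy sketch in plain Python. It is not the diffusers API: the class name RecordingAttnProcessor and its (query, key, value) call signature are hypothetical, chosen only to illustrate the idea. A real diffusers processor receives the attention module and hidden states instead. The point is simply that the processor computes the attention weights as usual and keeps them on self.probs so the pipeline can read them later.

```python
import math


def softmax(row):
    # Numerically stable softmax over one row of attention scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


class RecordingAttnProcessor:
    """Hypothetical toy processor: computes attention and book-keeps the weights."""

    def __init__(self):
        self.probs = None  # set on every call; read by the pipeline afterwards

    def __call__(self, query, key, value):
        # Scaled dot-product scores, one row per query token.
        scale = 1.0 / math.sqrt(len(query[0]))
        scores = [
            [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in key]
            for q in query
        ]
        self.probs = [softmax(row) for row in scores]
        # Weighted sum of the value vectors.
        dim_v = len(value[0])
        return [
            [sum(p * v[d] for p, v in zip(row, value)) for d in range(dim_v)]
            for row in self.probs
        ]


# Two query tokens attending over three key/value tokens (made-up numbers).
query = [[1.0, 0.0], [0.0, 1.0]]
key = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
value = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

processor = RecordingAttnProcessor()
output = processor(query, key, value)
```

After the call, processor.probs holds one row of weights per query token, which is the kind of map the editing step would compare between the source and the edited prompt.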

This pipeline is probably quite similar in its general idea to:

https://github.com/huggingface/diffusers/pull/2275

Feb 13 '23 07:02 patrickvonplaten

Thanks a lot for the issue!

Same comment as Patrick's for the attention weights.
Is the scheduler significantly different from the DDIMScheduler? If there is an example available, could you point to it?
Inversion and editing pipelines can go under the same directory.

Feb 13 '23 07:02 patil-suraj

Thanks for your inputs @patrickvonplaten @patil-suraj!

Is the scheduler significantly different from the DDIMScheduler? If there is an example available, could you point to it?

This is where the inversion scheduler is implemented.

Feb 13 '23 07:02 sayakpaul

Okay, if I understand it correctly, the only difference is the step, which is the inversion of the actual step, so it's not really a new scheduler. I wonder if we can just add an inversion_step or inverse_step method to the existing scheduler. Up for discussion, wdyt @patrickvonplaten @pcuenca @williamberman @yiyixuxu
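As a rough illustration of why this is "just the inverse of the step", here is a minimal sketch assuming deterministic DDIM (eta = 0) and plain Python lists instead of tensors. The function names ddim_move, step, and inverse_step and the alpha values are illustrative assumptions, not the diffusers scheduler API. Both directions use the same update and differ only in which cumulative alpha is the target.

```python
import math


def ddim_move(sample, noise_pred, alpha_from, alpha_to):
    """Move a sample between two noise levels (deterministic DDIM, eta = 0).

    alpha_from / alpha_to are cumulative alpha products (alpha-bar) in (0, 1].
    """
    out = []
    for x, eps in zip(sample, noise_pred):
        # Predicted clean sample implied by the current noise estimate.
        x0_pred = (x - math.sqrt(1.0 - alpha_from) * eps) / math.sqrt(alpha_from)
        # Re-noise the prediction to the target noise level.
        out.append(math.sqrt(alpha_to) * x0_pred + math.sqrt(1.0 - alpha_to) * eps)
    return out


def step(sample, noise_pred, alpha_t, alpha_prev):
    # Denoising direction: towards the less noisy level (larger alpha-bar).
    return ddim_move(sample, noise_pred, alpha_t, alpha_prev)


def inverse_step(sample, noise_pred, alpha_t, alpha_next):
    # Inversion direction: towards the noisier level (smaller alpha-bar).
    return ddim_move(sample, noise_pred, alpha_t, alpha_next)


# Made-up values: alpha-bar 0.5 is noisier than alpha-bar 0.8.
alpha_t, alpha_prev = 0.5, 0.8
x_t = [0.3, -1.2, 0.7]
eps = [0.1, 0.4, -0.5]

x_prev = step(x_t, eps, alpha_t, alpha_prev)
x_back = inverse_step(x_prev, eps, alpha_prev, alpha_t)
```

With the same noise prediction in both directions, x_back matches x_t up to floating point error. In a real inversion the model re-predicts the noise at each point along the way, so the round trip is only approximate.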

Feb 13 '23 08:02 patil-suraj

I would prefer to add inverse_step(). This is much more complete IMO.

Feb 13 '23 08:02 sayakpaul

+1. If it's just the inverse of the actual step, I would add it to the existing scheduler.

Feb 15 '23 21:02 williamberman

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Mar 15 '23 15:03 github-actions[bot]

We have Pix2Pix Zero in diffusers already.

https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero

Mar 15 '23 15:03 sayakpaul