
[Pipelines] On adding Pix2Pix Zero

Open sayakpaul opened this issue 3 years ago • 6 comments

Pix2Pix Zero presents a wonderful idea for editing the objects in a generated image with finer controls. Some really amazing results:

image

image

The code for Pix2Pix Zero is available here: https://github.com/pix2pixzero/pix2pix-zero.

IIUC, to add this pipeline to diffusers we need the following primary things:

  • Changes to cross attention so that the attention weights are stored for later use.
  • The DDIM inversion scheduler (especially if we want to condition the pipeline on a real input image).
  • Potentially, an inversion pipeline that gives us the inverted noise from the starting image.
  • A pipeline that takes in the inverted latent (or some randomly sampled noise if we're not using a real input image) and gives us the edited image back by actually applying the methodology proposed in Pix2Pix Zero.

I guess it's best to add the following things separately:

  • DDIM inversion scheduler
  • DDIM inversion pipeline
  • Editing pipeline (including changes to attention)

WDYT? @patrickvonplaten @patil-suraj

sayakpaul avatar Feb 13 '23 06:02 sayakpaul

  • I don't think we have to change the cross attention. Instead, we can create a pipeline-specific attention processor class inside the pipeline that has an attribute self.probs that can be set.
  • Yes, sounds good to me to create a DDIM inversion scheduler.
  • I would put everything that's needed for editing in one pipeline; no need to create multiple pipelines.
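To make the first point concrete, here is a minimal sketch of a processor object that computes attention and keeps the weights on itself. It is a toy in pure Python on nested lists, not the diffusers attention processor interface: the class name RecordingAttnProcessor, the softmax helper, and the (queries, keys, values) call signature are all assumptions made for illustration. Only the self.probs attribute comes from the comment above.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


class RecordingAttnProcessor:
    """Hypothetical processor: runs attention and records the weights."""

    def __init__(self):
        # Filled in on every call so the owning pipeline can read it later.
        self.probs = None

    def __call__(self, queries, keys, values):
        scale = 1.0 / math.sqrt(len(queries[0]))
        # Scaled dot-product scores: one row per query, one column per key.
        scores = [
            [scale * sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in keys]
            for q in queries
        ]
        self.probs = [softmax(row) for row in scores]
        # Weighted sum of the value vectors for each query.
        dim = len(values[0])
        return [
            [sum(p * v[j] for p, v in zip(row, values)) for j in range(dim)]
            for row in self.probs
        ]


processor = RecordingAttnProcessor()
out = processor(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

After the call, processor.probs holds one row of weights per query, so the surrounding pipeline can compare or reuse them without any change to the attention module itself.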

This pipeline is probably quite similar in its general idea to:

  • https://github.com/huggingface/diffusers/pull/2275

patrickvonplaten avatar Feb 13 '23 07:02 patrickvonplaten

Thanks a lot for the issue!

  • Same comment as Patrick's for the attention weights.
  • Is the scheduler significantly different from the DDIMScheduler? If there is an example available, could you point to it?
  • The inversion and editing pipelines can go under the same directory.

patil-suraj avatar Feb 13 '23 07:02 patil-suraj

Thanks for your inputs @patrickvonplaten @patil-suraj!

Is the scheduler significantly different from the DDIMScheduler? If there is an example available, could you point to it?

This is where the inversion scheduler is implemented.

sayakpaul avatar Feb 13 '23 07:02 sayakpaul

Okay, if I understand it correctly, the only difference is the step, which is the inverse of the regular step, so it's not really a new scheduler. I wonder if we can just add an inversion_step or inverse_step method to the existing scheduler. Up for discussion, WDYT @patrickvonplaten @pcuenca @williamberman @yiyixuxu?
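As a rough illustration of why the inverse step can sit next to the regular one, here is a toy deterministic DDIM update (eta = 0) on plain floats. It is a sketch under stated assumptions, not the diffusers DDIMScheduler: the class ToyDDIMScheduler, the helper _move, and the alphas_cumprod values are made up for the example. Only the method name inverse_step comes from the proposal above.

```python
import math


class ToyDDIMScheduler:
    """Toy deterministic DDIM (eta = 0) on scalars; not the diffusers class."""

    def __init__(self, alphas_cumprod):
        # alphas_cumprod[t] shrinks as t grows (more noise at larger t).
        self.alphas_cumprod = alphas_cumprod

    def _move(self, sample, noise_pred, t_from, t_to):
        a_from = self.alphas_cumprod[t_from]
        a_to = self.alphas_cumprod[t_to]
        # Estimate the clean sample implied by the noise prediction.
        pred_x0 = (sample - math.sqrt(1.0 - a_from) * noise_pred) / math.sqrt(a_from)
        # Re-noise that estimate to the target noise level.
        return math.sqrt(a_to) * pred_x0 + math.sqrt(1.0 - a_to) * noise_pred

    def step(self, noise_pred, t, sample):
        """Denoise: move the sample from timestep t to t - 1."""
        return self._move(sample, noise_pred, t, t - 1)

    def inverse_step(self, noise_pred, t, sample):
        """Invert: move the sample from timestep t to t + 1."""
        return self._move(sample, noise_pred, t, t + 1)


scheduler = ToyDDIMScheduler([0.99, 0.9, 0.7, 0.4])  # illustrative values
x1 = 0.5
eps = 0.2  # stand-in for a model's noise prediction
x2 = scheduler.inverse_step(eps, 1, x1)
x1_back = scheduler.step(eps, 2, x2)
```

Both directions share the same arithmetic and differ only in the target timestep. The round trip is exact here because the same eps is used both ways; with a real model the noise prediction is evaluated on different samples in each direction, so the recovery is only approximate.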

patil-suraj avatar Feb 13 '23 08:02 patil-suraj

I would prefer to add inverse_step(). This is much more complete IMO.

sayakpaul avatar Feb 13 '23 08:02 sayakpaul

+1. If it's just the inverse of the actual step, I would add it to the existing scheduler.

williamberman avatar Feb 15 '23 21:02 williamberman

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Mar 15 '23 15:03 github-actions[bot]

We have Pix2Pix Zero in diffusers already.

https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero

sayakpaul avatar Mar 15 '23 15:03 sayakpaul