MagicDance

Questions about Input Latent for Appearance Control Model

Open CrispyFeSo4 opened this issue 1 year ago • 5 comments

Could you please clarify whether the input to the appearance control model is the latent of the reference image or the noisy latent obtained after DDIM inversion? Thanks!

Jun 12 '24 02:06 CrispyFeSo4

As mentioned in the paper and the code, it is the latent of the reference image, without any added noise.

Jun 12 '24 02:06 Boese0601

Thank you for your response. Since many previous training-free methods use the noisy latent after inversion, I had this question and wanted to confirm it.

By the way, have you tried using the latent after inversion? What differences in results have you observed compared to directly inputting the latent?

Jun 12 '24 02:06 CrispyFeSo4

I have previously tried feeding the noisy latent to the appearance control model, with the noise sampled for the corresponding timestep t, but judging from the results it makes little difference or even makes things worse. This is implemented in the code as well: simply set wonoise to False in the arguments.
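
A minimal sketch of the two options, for illustration only. It is not taken from the MagicDance code: the function names, the linear beta schedule, and the way the flag is consumed are all assumptions, and only the flag name wonoise comes from this thread. It shows the standard forward-diffusion noising at timestep t, not DDIM inversion.

```python
import numpy as np

def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def appearance_input(ref_latent, t, alphas_cumprod, wonoise=True, rng=None):
    """Hypothetical helper: choose what the appearance model receives.

    wonoise=True  -> the clean reference latent (no noise added).
    wonoise=False -> the reference latent noised to timestep t via
                     x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps.
    """
    if wonoise:
        return ref_latent
    if rng is None:
        rng = np.random.default_rng()
    eps = rng.standard_normal(ref_latent.shape)
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * ref_latent + np.sqrt(1.0 - a_bar) * eps

# Example: a dummy 4x8x8 latent standing in for the encoded reference image.
alphas_cumprod = make_alphas_cumprod()
ref_latent = np.ones((4, 8, 8))
clean = appearance_input(ref_latent, t=500, alphas_cumprod=alphas_cumprod)
noisy = appearance_input(ref_latent, t=500, alphas_cumprod=alphas_cumprod,
                         wonoise=False, rng=np.random.default_rng(0))
```

With wonoise=True the latent is passed through unchanged; with wonoise=False the amount of noise grows with t, so the reference signal fades at later timesteps.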

Jun 12 '24 03:06 Boese0601

Thank you for your detailed response!! I suspect that it might be because the trainable models can handle both noisy and non-noisy latents, while training-free methods can only handle non-noisy latents. The trainable models can extract more information from the non-noisy latent as a reference. Good job!

Jun 12 '24 03:06 CrispyFeSo4

Hello, may I ask which file contains the class for the appearance control model in the code?

Aug 29 '24 10:08 bbing32475