MagicDance: Question about the Input Latent for the Appearance Control Model

Could you please clarify whether the input to the appearance control model is the latent of the reference image or the noisy latent obtained after DDIM inversion? Thanks!

Jun 12 '24 02:06 CrispyFeSo4

As mentioned in the paper and the code, it's the latent of the reference image, without any added noise.

Jun 12 '24 02:06 Boese0601

Thank you for your response. Since many previous training-free methods use the noisy latent after inversion, I had this question and wanted to confirm it.

By the way, have you tried using the latent after inversion? What differences in results have you observed compared to directly inputting the latent?

Jun 12 '24 02:06 CrispyFeSo4

I have tried feeding the noisy latent to the appearance control model, with the noise sampled for the corresponding timestep t, but judging from the results it makes little difference, or is even worse. This option is implemented in the code as well: just set wonoise to False in the arguments.
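For readers following along, the two options discussed above can be sketched roughly as below. This is only an illustration, not the repository's code: the function names `alphas_cumprod` and `appearance_input`, the linear beta schedule, and the default `wonoise=True` are assumptions, and plain Python lists stand in for latent tensors. Only the `wonoise` flag name and its meaning (False means noise is added at timestep t) come from the thread. Note that this is standard forward noising at timestep t, not DDIM inversion.

```python
import math
import random


def alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def appearance_input(ref_latent, t, acp, wonoise=True, rng=None):
    """Hypothetical helper: pick the latent fed to the appearance branch.

    wonoise=True  -> the clean reference latent, unchanged.
    wonoise=False -> the reference latent noised to timestep t:
                     sqrt(acp[t]) * z0 + sqrt(1 - acp[t]) * eps, eps ~ N(0, 1).
    """
    if wonoise:
        return list(ref_latent)
    rng = rng or random.Random()
    a = acp[t]
    return [
        math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for z in ref_latent
    ]


if __name__ == "__main__":
    acp = alphas_cumprod()
    z0 = [0.5, -1.0, 2.0]  # toy stand-in for a flattened reference latent
    print(appearance_input(z0, 500, acp, wonoise=True))
    print(appearance_input(z0, 500, acp, wonoise=False, rng=random.Random(0)))
```

With `wonoise=True` the appearance branch always sees the same clean latent regardless of t; with `wonoise=False` its input gets progressively noisier as t grows, which matches the variant described as making little difference.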

Jun 12 '24 03:06 Boese0601

Thank you for your detailed response! I suspect this is because trainable models can learn to handle both noisy and non-noisy latents, while training-free methods, which rely on a frozen denoising network, can only handle noisy latents. A trainable model can therefore extract more information from the non-noisy latent as a reference. Good job!

Jun 12 '24 03:06 CrispyFeSo4

Hi, could you tell me which file and class in the code correspond to the appearance control model?

Aug 29 '24 10:08 bbing32475