ChenMingL
Hello Author. I would like to ask about some technical details of the Temporal Conv operation in Figure 2 of the original article. Inputting [4, 16, 3, 224,...
Hello Author! In the sample_ddim sampling process in diffusion_trainer.py, skip is set to 1000, which means the time step jumps directly across the whole 0-1000 range, and...
Hello Author! I would like to ask: in the source code, both visual.py and audio-visual.py call sal_unet in saliency_decoder, which doesn't match the structure proposed in the paper. What's going...