diff_sal
Official implementation of the paper DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Excellent work — truly impressive! I noticed that in the comparison experiments you report qualitative and quantitative results for CASP-Net, but I could not find CASP-Net's code or pretrained weights on GitHub or in the CASP-Net paper. Could you share the open-source link for CASP-Net? Many thanks!
Your work is outstanding! However, I have had trouble reproducing the results of DiffSal on the different splits of the various audio-visual datasets. I downloaded the best audio_visual checkpoint provided in the repository, but I cannot obtain the metric values reported in the paper. Could you offer some guidance on the subsequent steps?
Hello, author. I would like to ask about some technical details of the Temporal Conv operation in Figure 2 of the original article. Inputting [4, 16, 3, 224,...
Where in the code are the fixation maps loaded? I could not find this in the dataloader.
Hello, author! In the sample_ddim sampling process in diffusion_trainer.py, skip is set to 1000, which means the 0–1000 time-step range is spanned in a single stride, and...
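For readers unfamiliar with the issue above: in DDIM-style samplers, a `skip` stride subsamples the training timesteps to form a shorter sampling trajectory. A minimal sketch of what the questioner describes — this is an illustration of the general mechanism, not the actual code from diffusion_trainer.py, and the variable names are assumptions:

```python
# Hypothetical sketch of DDIM timestep subsampling with a `skip` stride.
# With 1000 training steps, the stride controls how many denoising steps
# the sampler actually takes.
num_train_steps = 1000

def ddim_timesteps(skip):
    """Return the subsampled timestep sequence for a given stride."""
    return list(range(0, num_train_steps, skip))

# skip=100 yields a 10-step trajectory: [0, 100, 200, ..., 900]
print(ddim_timesteps(100))

# skip=1000 (as asked about in the issue) yields a single timestep,
# i.e. the whole 0-1000 span is crossed in one denoising step.
print(ddim_timesteps(1000))  # -> [0]
```

This is why setting `skip` equal to the full number of training steps collapses sampling to a single step, which appears to be the behavior the question is about.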
Hello, author! I would like to ask: in the source code, both visual.py and audio-visual.py call sal_unet in saliency_decoder, which doesn't match the structure proposed in the paper. What's going...