DiffusionDepth Question about DDIM loss

Hi, thanks for your wonderful work! While reading the code, I noticed that you apply the time embedding to the features extracted from the RGB images. I am wondering whether it would be better to apply the time embedding to the depth output by the decoder (namely 'refined_depth' as defined in your code), or just to the annotated depth with masks. Thanks again for your work and code!

Aug 02 '23 13:08 zyp-byte

Hi, thanks for your question. At the time, our thinking was to add the time embedding to a dense and consistent feature, which is closer to the original diffusion model. I haven't tried putting the time embedding directly on the depth map. Have you made any attempts at that? If the results are positive, I'd be keen to work on an improved version with you.

Aug 03 '23 00:08 duanyiqun
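
For readers following the thread, here is a minimal sketch of what "adding the time embedding to a feature" can look like. It assumes the standard sinusoidal timestep embedding and a plain element-wise sum; the names `sinusoidal_embedding` and `add_time_embedding` are hypothetical and are not taken from the DiffusionDepth code, which may condition on the timestep differently.

```python
import math

def sinusoidal_embedding(t, dim):
    """Sinusoidal embedding of an integer timestep t into `dim` values (dim must be even)."""
    half = dim // 2
    # Geometrically spaced frequencies, from 1 down towards 1/10000.
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

def add_time_embedding(feature, t):
    """Hypothetical helper: add the embedding of t element-wise to one feature vector."""
    emb = sinusoidal_embedding(t, len(feature))
    return [x + e for x, e in zip(feature, emb)]

# The same feature vector conditioned on two different timesteps.
feature = [1.0, 2.0, 3.0, 4.0]
print(add_time_embedding(feature, 0))
print(add_time_embedding(feature, 10))
```

The same operation could in principle be applied to a depth map instead of a feature vector, which is the alternative raised in the question; whether that helps is untested in this thread.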

Hi! Thanks for your nice work. I noticed that you chose to predict x0 instead of the noise, as DDPM does. Could you share the reason? Thanks again!

Sep 22 '23 01:09 VLadImirluren
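
As background for the question above, this is a minimal sketch of how the two prediction targets relate under the standard DDPM forward process, x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps. Given x_t and the cumulative schedule value alpha_bar, a prediction of x0 can be converted into a prediction of the noise and back. The function names are hypothetical, and nothing here is taken from the DiffusionDepth code or explains the authors' reason for their choice.

```python
import math

def q_sample(x0, eps, alpha_bar):
    """Forward process: noise a clean sample x0 with noise eps at level alpha_bar."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]

def eps_from_x0(x_t, x0, alpha_bar):
    """Recover the implied noise from x_t and an x0 prediction (requires alpha_bar < 1)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - a * x) / b for xt, x in zip(x_t, x0)]

def x0_from_eps(x_t, eps, alpha_bar):
    """Recover the implied clean sample from x_t and a noise prediction (requires alpha_bar > 0)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps)]

# Round trip with toy values: both parameterizations describe the same x_t.
x0, eps, alpha_bar = [1.0, -2.0], [0.5, 0.25], 0.64
x_t = q_sample(x0, eps, alpha_bar)
print(x_t)
print(eps_from_x0(x_t, x0, alpha_bar))
print(x0_from_eps(x_t, eps, alpha_bar))
```

The two targets carry the same information for 0 < alpha_bar < 1, but the conversion divides by sqrt(1 - alpha_bar) or sqrt(alpha_bar), so errors are scaled differently at low and high noise levels, and the training loss weights timesteps differently depending on which target is regressed.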