ZoeDepth

Inaccurate depth estimation beyond 40m

Open • mwdotzom opened this issue 2 years ago • 3 comments

Hello @thias15 @shariqfarooq123, thank you for the great work!

I ran into two problems when running inference with the models on outdoor driving scenes:

  1. According to your description, model_zoe_k should be the one to choose here. However, model_zoe_n and model_zoe_k gave results of around 1m ~ 7m, only model_zoe_nk gave 7m ~ 65m, while gt is 1m ~ 80m. The model_zoe_nk output is barely satisfactory for cars within 10 ~ 40m (<2m error), but at close and far ranges the predictions are far from reality: for example, the front of the car carrying the camera is predicted at 8.5m, and a distant car with gt = 65.8m gets pred = 44.6m. The original RGB image can be downloaded here. (image.png) It is also worth mentioning that the sky is predicted at almost the same depth as the ground, which is easy to see in the picture. This only happens with model_zoe_nk and mode="eval". Do you have any insights on how to improve the metric predictions at close and far distances? Would further training on additional datasets help? (I am not sure what would be beneficial, given that the model is already trained on 12 datasets...)

  2. As described similarly in issue #28, I tried both the default mode ('infer') and mode="eval", but got the same results. Could you provide a detailed example of the correct way to select the mode with torch.hub.load()?
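
To make the range-dependent error from item 1 easier to compare across models, here is a minimal sketch of how the error could be broken down by ground-truth distance. It is plain Python and not part of ZoeDepth: the helper `error_by_range` and the bin edges are hypothetical, and only the 65.8m / 44.6m pair comes from the measurements above; the other sample values are made up for illustration.

```python
from statistics import mean


def error_by_range(gt, pred, edges):
    """Mean absolute error (in metres) of pred vs. gt, grouped by gt depth bin.

    edges is an ascending list of bin boundaries; each bin is [lo, hi).
    Bins with no samples map to None.
    """
    buckets = {(lo, hi): [] for lo, hi in zip(edges[:-1], edges[1:])}
    for g, p in zip(gt, pred):
        for (lo, hi), errs in buckets.items():
            if lo <= g < hi:
                errs.append(abs(p - g))
                break
    return {rng: (mean(errs) if errs else None) for rng, errs in buckets.items()}


# Only the last pair (gt 65.8m, pred 44.6m) is taken from the report above;
# the first three pairs are invented placeholders.
gt = [5.0, 20.0, 30.0, 65.8]
pred = [8.0, 21.0, 29.0, 44.6]

report = error_by_range(gt, pred, edges=[0, 10, 40, 80])
for (lo, hi), mae in report.items():
    print(f"gt in [{lo}, {hi}) m: mean abs error = {mae}")
```

Running the same breakdown on the outputs of model_zoe_n, model_zoe_k and model_zoe_nk would show whether the error grows with distance for all three or only for some of them.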

Thank you for your time! :D

mwdotzom • Jun 19 '23 01:06

Yeah, I am running into the same problem.

Ghul-huan • Aug 10 '23 07:08

model_zoe_k should be the one to choose here. However, model_zoe_n and model_zoe_k gave results of around 1m ~ 7m, only model_zoe_nk gave 7m ~ 65m

Yeah, I am also seeing this...

philippwulff • Jan 24 '24 20:01

Is there any solution for this? I am running into the same problem :((

toannguyen1904 • Mar 28 '24 08:03