
Test in a new scene

JOP-Lee opened this issue 2 years ago · 2 comments

Hello, I would like to know whether the pre-trained model can be used to estimate absolute depth maps in a new scene, e.g. from an input RGB image or a video sequence. If so, how can the scale information of the multiple depth maps estimated by the pre-trained model be obtained? I want to fuse multiple depth maps into a point cloud, as your video demo shows. Do you have any suggestions? I would appreciate it very much.

JOP-Lee avatar Sep 14 '23 12:09 JOP-Lee

Hi @JOP-Lee. All predictions made by our evaluation pipeline are in metric scale. Metric scale is recovered through matched features in the overlapping regions of the calibrated multi-camera system (see Sec. 3.2 in the paper). Furthermore, the prior of the completion network is also in metric scale, because it was trained with the metric-scale poses obtained this way. Thus, you can combine the resulting depth maps into a point cloud as follows:

  1. Run evaluate.py and save the predicted depth maps and poses
  2. Unproject depth maps by using the respective camera intrinsics
  3. Transform points from the camera reference frame to the world reference frame with the estimated poses
  4. (Optional) Filter outliers
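
Steps 2-4 can be sketched as below. This is a minimal illustration with numpy, not part of the r3d3 codebase; the names (`depth`, `K`, `T_world_cam`) and the simple range-based outlier filter are assumptions, and it presumes the depth maps and poses saved in step 1 have already been loaded as arrays.

```python
import numpy as np

def depth_to_world_points(depth, K, T_world_cam):
    """Unproject a metric depth map and move the points to the world frame.

    depth:        (H, W) metric depth in meters
    K:            (3, 3) camera intrinsics
    T_world_cam:  (4, 4) camera-to-world pose
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates, one row per pixel
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T          # back-projected rays, camera frame
    pts_cam = rays * depth.reshape(-1, 1)    # scale rays by metric depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ T_world_cam.T)[:, :3]    # camera frame -> world frame

def filter_outliers(points, max_range=80.0):
    # Step 4 (optional): crude range-based filtering. A statistical filter
    # (e.g. Open3D's remove_statistical_outlier) is a common alternative.
    return points[np.linalg.norm(points, axis=1) < max_range]
```

Repeating this per camera and per frame and concatenating the results gives the fused point cloud.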

AronDiSc avatar Sep 14 '23 17:09 AronDiSc

@AronDiSc Thank you for your response. Could you provide a script for testing new scenes from a single image or from multiple cameras? It seems that evaluate.py is meant for evaluating on the DDAD and nuScenes datasets, and it requires masks and poses. For beginners testing new scenes, it would be very convenient to predict depth maps directly from single images (similar to monodepth2).

JOP-Lee avatar Sep 15 '23 03:09 JOP-Lee