StructDepth

Camera Intrinsics for pretrained model

Open stefanklut opened this issue 4 years ago • 2 comments

Hi there,

Could you please tell me which camera intrinsics were used for the pretrained models? I would like to view depth results for a custom input as a point cloud, similar to #2. It is unclear to me which intrinsics I should use for these images: the values found in nyu_dataset? With or without the crop? Also, what is the purpose of dividing by height and width?

Thanks in advance for your help

stefanklut, Dec 20 '21 16:12

The camera intrinsics change with image crop, scale, and flip. Please refer to section 4.1 in the paper and to nyu_dataset.

About your questions:

  1. The nyu_dataset contains two different camera intrinsic settings, one for training and one for testing. They differ because the crop size during testing is different from the one used during training. During training, the camera intrinsics also need to change with the flip;
  2. As in reply 1, different crop settings result in different cx and cy;
  3. The purpose of dividing by height and width is to normalize the camera intrinsics first, and then restore them according to the zoom ratio.
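
The adjustments described in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the code in nyu_dataset: the `Intrinsics` class, the function names, and all numeric values are hypothetical, and the flip formula assumes continuous pixel coordinates where the image spans [0, width] (other conventions use `width - 1 - cx`).

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    fx: float
    fy: float
    cx: float
    cy: float


def crop(k, left, top):
    # Cropping shifts the principal point; focal lengths are unchanged.
    return Intrinsics(k.fx, k.fy, k.cx - left, k.cy - top)


def normalize(k, width, height):
    # Divide by the image size so the intrinsics become resolution independent.
    return Intrinsics(k.fx / width, k.fy / height, k.cx / width, k.cy / height)


def denormalize(k_norm, width, height):
    # Restore pixel units for the resolution the network actually sees.
    return Intrinsics(k_norm.fx * width, k_norm.fy * height,
                      k_norm.cx * width, k_norm.cy * height)


def hflip(k, width):
    # Horizontal flip mirrors the principal point (assumed convention, see above).
    return Intrinsics(k.fx, k.fy, width - k.cx, k.cy)


# Hypothetical 640x480 camera; these are NOT the NYU values.
k_raw = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
k_crop = crop(k_raw, left=16, top=16)           # image is now 608x448
k_norm = normalize(k_crop, width=608, height=448)
k_net = denormalize(k_norm, width=304, height=224)
k_flip = hflip(k_net, width=304)
```

Normalizing first means the same stored values can be reused for any resize factor, which is why the crop has to be applied before the division.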

If you use a custom input for testing, you need to define the camera intrinsics yourself. You don’t need to follow our settings exactly, and whether to crop depends on your needs. What matters is that the image matches the camera intrinsics when the point cloud is restored.
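
As a rough illustration of why the image and intrinsics must match, here is a minimal pinhole back-projection sketch using NumPy. The function name `depth_to_points` and the example values are hypothetical and not part of the repository. It assumes `depth` already holds metric depth along the optical axis at the same resolution that the intrinsics describe.

```python
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud."""
    h, w = depth.shape
    # u is the column index, v is the row index, both of shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


# Hypothetical example: a flat surface 2 units in front of the camera.
depth = np.full((2, 3), 2.0)
points = depth_to_points(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
```

If the intrinsics correspond to a different crop or resolution than the depth map, `cx`, `cy`, and the focal lengths are off, and the reconstructed geometry is sheared or scaled.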

huangyuan2020, Dec 21 '21 06:12

Thank you for your help! I seem to have almost got it working, but I have a few questions just to make sure.

Am I correct in assuming that the depth values need to be inverted (1/depth) when taken from the .npy file? The results seem inverted otherwise.

Also, I believe there is a typo in inference_single_image.py, where the image is resized. The PIL resize is given (thisH, thisW), which should be reversed to (thisW, thisH), since the PIL resize takes width as its first argument. Or is this intended?

Finally, how strict are the planar fitting constraints? The results I get have slightly bent walls. Is this still an issue with the intrinsics, or can this also happen with the model?

Thanks again for the effort

stefanklut, Dec 21 '21 12:12