dynibar

Question about camera poses

Open Loydian opened this issue 2 years ago • 2 comments

Thanks for your extraordinary work and for releasing the code. I found that, after you transform the camera poses from LLFF format to NeRF format with the following code,

  poses = np.concatenate(
      [poses[:, 1:2, :], -poses[:, 0:1, :], poses[:, 2:, :]], 1
  )

you invert the y and z axes again in the parse_llff_pose function, as follows. What is this for?

def parse_llff_pose(pose):
  """convert llff format pose to 4x4 matrix of intrinsics and extrinsics."""

  h, w, f = pose[:3, -1]
  c2w = pose[:3, :4]
  c2w_4x4 = np.eye(4)
  c2w_4x4[:3] = c2w
  c2w_4x4[:, 1:3] *= -1
  intrinsics = np.array(
      [[f, 0, w / 2.0, 0], [0, f, h / 2.0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
  )
  return intrinsics, c2w_4x4

This really confuses me.

Loydian avatar Jul 28 '23 03:07 Loydian

It seems that you transform the poses into OpenCV format. What is this for?

Loydian avatar Jul 28 '23 03:07 Loydian

This is for historical/compatibility reasons: it keeps the code consistent with the data format and codebase of the original NeRF and IBRNet. Basically, the script loads COLMAP-format poses, converts them, and saves them in LLFF format. In the end, the dataloader transforms them back into COLMAP format. So basically, it is OpenCV in and OpenCV out for training and inference.
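
As a rough illustration of that round trip (a sketch, not code taken from the repository), the snippet below applies the three axis changes to a single 3x4 camera-to-world matrix. The helper names `opencv_to_llff`, `llff_to_opengl`, and `opengl_to_opencv` are hypothetical. It assumes the usual conventions: OpenCV/COLMAP columns are [right, down, forward], LLFF columns are [down, right, backward], and NeRF/OpenGL columns are [right, up, backward]. The last two helpers mirror the two snippets quoted in the question.

```python
import numpy as np


def opencv_to_llff(c2w):
    """[right, down, forward, t] -> [down, right, backward, t] (assumed LLFF layout)."""
    return np.concatenate(
        [c2w[:, 1:2], c2w[:, 0:1], -c2w[:, 2:3], c2w[:, 3:]], axis=1
    )


def llff_to_opengl(c2w):
    """[down, right, backward, t] -> [right, up, backward, t], as in the first quoted snippet."""
    return np.concatenate([c2w[:, 1:2], -c2w[:, 0:1], c2w[:, 2:]], axis=1)


def opengl_to_opencv(c2w):
    """Negate the y and z columns, as parse_llff_pose does with c2w_4x4[:, 1:3] *= -1."""
    out = c2w.copy()
    out[:, 1:3] *= -1
    return out


# Hypothetical OpenCV-style pose: 30 degree rotation about y, camera at (1, 2, 3).
theta = np.deg2rad(30.0)
rotation = np.array(
    [
        [np.cos(theta), 0.0, np.sin(theta)],
        [0.0, 1.0, 0.0],
        [-np.sin(theta), 0.0, np.cos(theta)],
    ]
)
translation = np.array([[1.0], [2.0], [3.0]])
c2w_opencv = np.hstack([rotation, translation])

c2w_llff = opencv_to_llff(c2w_opencv)    # what the preprocessing script would save
c2w_opengl = llff_to_opengl(c2w_llff)    # LLFF -> NeRF/OpenGL axes
c2w_back = opengl_to_opencv(c2w_opengl)  # NeRF/OpenGL -> OpenCV axes

# The second flip undoes the sign changes, so the pose is back in OpenCV convention.
print(np.allclose(c2w_back, c2w_opencv))
```

Under these assumptions the two conversions cancel out, which is consistent with the "OpenCV in, OpenCV out" description: the intermediate LLFF and NeRF layouts only matter for compatibility with the older data format.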

zhengqili avatar Aug 09 '23 21:08 zhengqili