Question about camera poses
Thanks for your extraordinary work and the released code. I noticed that you first transform the camera poses from the LLFF format to the NeRF format with the following code:

```python
poses = np.concatenate(
    [poses[:, 1:2, :], -poses[:, 0:1, :], poses[:, 2:, :]], 1
)
```

but then you invert the y and z axes again in the `parse_llff_pose` function below. What is this for?
```python
def parse_llff_pose(pose):
    """convert llff format pose to 4x4 matrix of intrinsics and extrinsics."""
    h, w, f = pose[:3, -1]
    c2w = pose[:3, :4]
    c2w_4x4 = np.eye(4)
    c2w_4x4[:3] = c2w
    c2w_4x4[:, 1:3] *= -1
    intrinsics = np.array(
        [[f, 0, w / 2.0, 0], [0, f, h / 2.0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    )
    return intrinsics, c2w_4x4
```
This really confuses me. It seems that you transform the poses into the OpenCV format. What is the reason for this?
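To make the question concrete, here is a small sketch (plain NumPy, independent of the repo) of what that column flip in `parse_llff_pose` does to a camera frame. It assumes the incoming pose is in the NeRF/OpenGL convention, where the rotation columns are the camera's right, up, and backward axes:

```python
import numpy as np

# A NeRF/OpenGL-style camera-to-world matrix: the rotation columns are the
# camera's x (right), y (up), and z (backward) axes in world coordinates.
# Identity is used here purely for illustration.
c2w = np.eye(4)

# The flip from parse_llff_pose: negate the y and z columns.
c2w_cv = c2w.copy()
c2w_cv[:, 1:3] *= -1

# The rotation columns now read [right, down, forward], i.e. the OpenCV
# convention (+y points down in the image, +z points into the scene).
print(c2w_cv[:3, :3])
# [[ 1.  0.  0.]
#  [ 0. -1.  0.]
#  [ 0.  0. -1.]]
```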
This is for historical/compatibility reasons, to keep the code consistent with the original NeRF and IBRNet data formats and codebases. Basically, the script loads COLMAP-format poses, then converts and saves them in the LLFF format. In the end, the dataloader transforms them back into the COLMAP convention. So it is essentially OpenCV in, OpenCV out for both training and inference.
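A minimal sketch of that round trip in plain NumPy, with the conversions written out by hand. The column orders are my reading of the usual conventions (OpenCV: right/down/forward; LLFF: down/right/backward; NeRF: right/up/backward), not code from the repo:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random proper rotation standing in for a COLMAP/OpenCV camera-to-world
# rotation, whose columns are [right, down, forward] in world coordinates.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
r_cv = q * np.sign(np.linalg.det(q))

# COLMAP/OpenCV -> LLFF: LLFF columns are [down, right, backward].
r_llff = np.stack([r_cv[:, 1], r_cv[:, 0], -r_cv[:, 2]], axis=1)

# LLFF -> NeRF, mirroring the repo's concatenate line: new columns are
# [llff_col1, -llff_col0, llff_col2] = [right, up, backward] (OpenGL style).
r_nerf = np.stack([r_llff[:, 1], -r_llff[:, 0], r_llff[:, 2]], axis=1)

# parse_llff_pose's flip: negate columns 1 and 2 -> [right, down, forward].
r_back = r_nerf.copy()
r_back[:, 1:3] *= -1

# The two negations undo the LLFF/NeRF detours: OpenCV in, OpenCV out.
print(np.allclose(r_back, r_cv))  # True
```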