Code for Multiview RGB-D Reconstruction
Hi, thanks for your great work!
I tried to reconstruct the 3D point cloud by directly unprojecting the foreground pixels using the predicted multiview RGB-D images, but I couldn’t get satisfactory results without post-processing.
Could you please release the code for your multiview RGB-D reconstruction?
Hi!
We apply two post-processing steps. First, we mask the pixels we unproject via a threshold on the RGB image. Second, we remove outliers; below is a sample function.
import numpy as np
import open3d as o3d

def filter_point_cloud(pts):
    """Voxel-downsample an (N, 3) point array and remove statistical outliers."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    # Downsample with a 1 cm voxel grid, then drop points whose mean distance
    # to their 20 nearest neighbors exceeds 2 standard deviations of the average.
    downpcd = pcd.voxel_down_sample(voxel_size=0.01)
    downpcd, ind = downpcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    return np.asarray(downpcd.points, np.float32)
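For the first step, a minimal sketch of the RGB-threshold masking could look like the following. Note that `foreground_mask`, the `bg_threshold` value, and the assumption of a near-white background with RGB in [0, 1] are my own illustrative choices, not necessarily what the authors used:

```python
import numpy as np

def foreground_mask(rgb, bg_threshold=0.95):
    """Return a boolean mask that is True on foreground pixels.

    Assumes rgb has shape (H, W, 3) with values in [0, 1] and a
    near-white background (hypothetical convention for this sketch).
    """
    # A pixel is treated as background if all three channels are near white.
    background = np.all(rgb > bg_threshold, axis=-1)
    return ~background

# Tiny usage example: a white image with a single colored pixel.
rgb = np.ones((4, 4, 3), dtype=np.float32)
rgb[1, 1] = [0.2, 0.3, 0.4]  # one foreground pixel
mask = foreground_mask(rgb)
# mask is True only at (1, 1)
```

Only the pixels where `mask` is True would then be unprojected and passed to `filter_point_cloud`.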
Thank you for your response. I’ve unprojected the point clouds as you suggested like this:
import torch
import open3d as o3d
from pytorch3d.renderer import PerspectiveCameras

# Zero out background depth before unprojecting
depth_map = depth_map * mask
camera = PerspectiveCameras(
    focal_length=((2.1875, 2.1875),),
    principal_point=((0, 0),),
    image_size=image_size,
    R=R,
    T=T,
)
# Pixel-coordinate grid; indexing="ij" makes the (row, col) ordering explicit
grid = torch.meshgrid(torch.arange(image_size[0]), torch.arange(image_size[1]), indexing="ij")
grid = torch.stack(grid, dim=-1).float()
# Keep only pixels with valid (positive) depth
non_zero_mask = depth_map > 0
xy_masked = grid[non_zero_mask]
z_masked = depth_map[non_zero_mask]
xyz = torch.cat([xy_masked, z_masked.unsqueeze(-1)], dim=-1)
world_points = camera.unproject_points(xyz, world_coordinates=True)
pcds = o3d.geometry.PointCloud()
pcds.points = o3d.utility.Vector3dVector(world_points.cpu().numpy())
downpcd = pcds.voxel_down_sample(voxel_size=0.01)
downpcd, ind = downpcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
But I'm still not getting an ideal result. The depth scale seems off. Did you apply any scale and shift transformations on the predicted depth map?
If you are loading depth from a .png (values in [0, 1]), then you need to unscale the values. If you are using pred_depth from model_output[:,4:,...] as in test.py, you first need to unnormalize it from [-1, 1] to [0, 1]. The Objaverse depth values should range from [0.5, 2.5].
def _unscale_depth(depths):
    '''
    Rescale depth from [0, 1] to [0.5, 2.5]
    '''
    shift = 0.5
    scale = 2.0
    depths = depths * scale + shift
    return depths
def unnormalize(x):
    '''
    Unnormalize [-1, 1] to [0, 1]
    '''
    return torch.clip((x + 1.0) / 2.0, 0.0, 1.0)
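Chaining the two helpers, recovering metric-range depth from the raw network output would look like the sketch below (assuming pred_depth lies in [-1, 1] as for model_output[:,4:,...] in test.py; the sample tensor is just for illustration):

```python
import torch

def unnormalize(x):
    """Unnormalize [-1, 1] to [0, 1]."""
    return torch.clip((x + 1.0) / 2.0, 0.0, 1.0)

def _unscale_depth(depths):
    """Rescale depth from [0, 1] to [0.5, 2.5]."""
    return depths * 2.0 + 0.5

# Hypothetical raw model output in [-1, 1]
pred_depth = torch.tensor([-1.0, 0.0, 1.0])
depth = _unscale_depth(unnormalize(pred_depth))
# depth now spans the Objaverse range [0.5, 2.5]: tensor([0.5, 1.5, 2.5])
```

This depth is what should be multiplied by the foreground mask before unprojection; skipping either step leaves the point cloud at the wrong scale.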