
Code for Multiview RGB-D Reconstruction

MiaApr opened this issue 1 year ago • 3 comments

Hi, thanks for your great work!
I tried to reconstruct the 3D point cloud by directly unprojecting the foreground pixels using the predicted multiview RGB-D images, but I couldn’t get satisfactory results without post-processing. Could you please release the code for your multiview RGB-D reconstruction?

MiaApr · Sep 20 '24 03:09

Hi!

We perform two post-processing steps. First, we mask the pixels we unproject via a threshold on the RGB image. Second, we remove outliers; a sample function is below.

import numpy as np
import open3d as o3d

def filter_point_cloud(pts):
    # Build an Open3D point cloud from the (N, 3) array of unprojected points
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    # Voxel-downsample, then drop statistical outliers
    downpcd = pcd.voxel_down_sample(voxel_size=0.01)
    downpcd, ind = downpcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    return np.asarray(downpcd.points, np.float32)
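
For the first step, a minimal sketch of masking via an RGB threshold might look like the following. The white-background assumption and the 0.95 threshold are illustrative choices, not values taken from the repository:

import torch

def foreground_mask(rgb, threshold=0.95):
    '''
    Hypothetical helper: keep pixels that are not near the (assumed white) background.
    rgb: (H, W, 3) tensor with values in [0, 1]; threshold is an assumed value.
    '''
    # A pixel counts as foreground if any channel falls below the threshold
    return (rgb < threshold).any(dim=-1)

The resulting boolean mask can then be used to zero out the background in the depth map before unprojection.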

zhizdev · Oct 31 '24 16:10

Thank you for your response. I've unprojected the point clouds as you suggested, like this:

import torch
import open3d as o3d
from pytorch3d.renderer import PerspectiveCameras

# mask out the background, then unproject with the PyTorch3D camera
depth_map = depth_map * mask
camera = PerspectiveCameras(focal_length=((2.1875, 2.1875),),
                            principal_point=((0, 0),),
                            image_size=image_size,
                            R=R,
                            T=T,)

# per-pixel coordinate grid
grid = torch.meshgrid(torch.arange(image_size[0]), torch.arange(image_size[1]), indexing='ij')
grid = torch.stack(grid, dim=-1).float()

# keep only pixels with valid (non-zero) depth
non_zero_mask = depth_map > 0
xy_masked = grid[non_zero_mask]
z_masked = depth_map[non_zero_mask]
xyz = torch.cat([xy_masked, z_masked.unsqueeze(-1)], dim=-1)
world_points = camera.unproject_points(xyz, world_coordinates=True)

# Open3D expects a numpy array, not a torch tensor
pcds = o3d.geometry.PointCloud()
pcds.points = o3d.utility.Vector3dVector(world_points.cpu().numpy())

downpcd = pcds.voxel_down_sample(voxel_size=0.01)
downpcd, ind = downpcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

But I'm still not getting an ideal result. The depth scale seems off. Did you apply any scale and shift transformations on the predicted depth map?

MiaApr · Feb 20 '25 03:02

If you are loading depth from a .png in [0, 1], then you need to unscale the values. If you are using pred_depth from model_output[:,4:,...] as in test.py, you first need to unnormalize it from [-1, 1] to [0, 1]. The Objaverse depth values should range from [0.5, 2.5].

import torch

def _unscale_depth(depths):
    '''
    Rescale depth from [0, 1] to [0.5, 2.5]
    '''
    shift = 0.5
    scale = 2.0
    depths = depths * scale + shift
    return depths

def unnormalize(x):
    '''
    Unnormalize [-1, 1] to [0, 1]
    '''
    return torch.clip((x + 1.0) / 2.0, 0.0, 1.0)
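
Putting the two together for the prediction path mentioned above, a hedged usage sketch (assuming model_output is the tensor from test.py) would be:

# model_output[:, 4:, ...] is the predicted depth channel, still in [-1, 1]
pred_depth = model_output[:, 4:, ...]
depth = _unscale_depth(unnormalize(pred_depth))  # roughly in [0.5, 2.5], matching Objaverse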

zhizdev · Feb 21 '25 07:02