ChatSim and Blender rendering with my own scene data

Hi, I ran into a problem when blending the background RGB image with 3D assets. Adding a traffic cone on the ground works with your provided scene data. However, with my own scene data the traffic cone does not land on the ground, even when I set z to 0.

The config looks like this:

cars:
- blender_file: blend_3d_assets/data_assets/Traffic_cone.blend
  insert_pos:
  - 10.459
  - -0.0068
  - 2
  insert_rot:
  - 0
  - 0
  - -0.0045
  model_obj_name: Car
  new_obj_name: cone2
- blender_file: blend_3d_assets/data_assets/Traffic_cone.blend
  insert_pos:
  - 10.459
  - -0.0068
  - 0
  insert_rot:
  - 0
  - 0
  - -0.0045
  model_obj_name: Car
  new_obj_name: cone3

The cam2world matrix is:

[[ 0.0112684  -0.02057352  0.99972486  1.9004364 ]
 [-0.9999064   0.00751276  0.01142506 -0.03219445]
 [-0.00774574 -0.9997601  -0.02048694  1.4025325 ]
 [ 0.          0.          0.          1.        ]]

I would like to know whether the Blender Python scripts contain any hard-coded values for the Waymo setting, and how I can debug the scripts in the VS Code debugger. Thanks in advance. Best wishes.

Apr 10 '24 14:04 szhang963

This is because your world coordinate frame is not aligned with the road surface.

When processing Waymo data, we use the vehicle coordinate frame (red circle in the figure) of the first frame as the world origin, and its x-y plane coincides with the ground.
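
One quick sanity check for this alignment is to read the camera height and the camera's up direction straight out of cam2world. The sketch below uses the matrix quoted in the question and assumes an OpenCV-style camera frame (x right, y down, z forward) and a camera mounted roughly level; neither assumption is confirmed in this thread, and `camera_height_and_tilt` is a hypothetical helper, not part of ChatSim. A plausible mounting height and a small tilt are necessary but not sufficient for the x-y plane to sit on the road.

```python
import numpy as np

# cam2world matrix quoted in the question above
c2w = np.array([
    [ 0.0112684,  -0.02057352,  0.99972486,  1.9004364 ],
    [-0.9999064,   0.00751276,  0.01142506, -0.03219445],
    [-0.00774574, -0.9997601,  -0.02048694,  1.4025325 ],
    [ 0.0,         0.0,         0.0,         1.0       ],
])

def camera_height_and_tilt(c2w):
    """Return (camera height above the world z=0 plane, tilt of camera-up vs world +z in degrees).

    Assumes an OpenCV-style camera frame, so the camera's up direction is -y.
    """
    height = c2w[2, 3]                 # z component of the camera position
    cam_up_in_world = -c2w[:3, 1]      # second column is the camera y-axis (down)
    cos_tilt = cam_up_in_world[2] / np.linalg.norm(cam_up_in_world)
    tilt_deg = np.degrees(np.arccos(np.clip(cos_tilt, -1.0, 1.0)))
    return float(height), float(tilt_deg)

height, tilt_deg = camera_height_and_tilt(c2w)
print(f"camera height above z=0: {height:.3f}, up-axis tilt: {tilt_deg:.2f} deg")
```

If the height is far from the real mounting height of the camera, or the tilt is large, the z=0 plane of the world frame is probably not the road surface.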

Apr 10 '24 16:04 yifanlu0227

If you have a PC with a display, you can remove the -b option from blender -b --python blender_utils/main_multicar.py -- config/1346_multi_car_demo.yaml.

Blender will then open its GUI and reproduce all the steps, so you can easily debug there.

Apr 11 '24 00:04 yifanlu0227

Thanks for your reply. How can I check whether the world coordinate frame is aligned with the ground? Is the world coordinate frame defined by 'extrinsic' in scene.npz?

Apr 11 '24 06:04 szhang963

Hi, I have aligned the ground with the world coordinate frame. However, I have a new question about the rendered depth of 3D assets. The asset does not align strictly with the ground, particularly when compositing it with the background depth.

I converted the depth of the asset (inserted at z = 0) to a point cloud using the camera intrinsics (cx, cy equal to half of W, H) and extrinsics (cam2world). I then found that the z value of the asset is not 0. As you can see, z at the bottom of the cone deviates by a large margin.

Could you help me find the cause? Thanks in advance.

Apr 19 '24 12:04 szhang963

Which coordinate frame is the point cloud in your figure in? If it is in your camera frame, the z value at the bottom of the cone is generally not 0. To be specific, z is zero in Blender's (and your scene's) world frame.

ChatSim only changes the x, y position of the asset, so the asset should always stay on the ground.
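
To make the camera-frame versus world-frame distinction concrete, here is a minimal sketch that moves a ground point between the two frames. It reuses the cam2world matrix and the second cone's insert_pos from the first post purely as example numbers, and it assumes the matrix maps homogeneous camera coordinates to world coordinates.

```python
import numpy as np

# cam2world matrix and cone position taken from the first post (example numbers only)
c2w = np.array([
    [ 0.0112684,  -0.02057352,  0.99972486,  1.9004364 ],
    [-0.9999064,   0.00751276,  0.01142506, -0.03219445],
    [-0.00774574, -0.9997601,  -0.02048694,  1.4025325 ],
    [ 0.0,         0.0,         0.0,         1.0       ],
])
p_world = np.array([10.459, -0.0068, 0.0, 1.0])  # homogeneous point on the z=0 plane

# world -> camera: apply the inverse of cam2world
p_cam = np.linalg.inv(c2w) @ p_world
print("camera-frame coordinates:", p_cam[:3])    # z here is depth along the optical axis

# camera -> world: applying cam2world recovers z = 0
p_back = c2w @ p_cam
print("world-frame z after round trip:", p_back[2])
```

The same physical point has z = 0 in the world frame but a z of several units in the camera frame, so the frame of a point cloud has to be known before its z values are interpreted.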

Apr 19 '24 12:04 yifanlu0227

Hi, it should be the world coordinate frame. I used your provided data (data_assets/scene-demo-1137/0.npz) to convert the cone depth to a point cloud in world coordinates. You can check it against my code below.

Code

    import os

    import cv2
    import numpy as np
    import pandas as pd
    from pyntcloud import PyntCloud

    blender_depth_path = os.path.join(blender_out_path, "depth")
    blender_depth = os.path.join(blender_depth_path, "vehicle_and_plane0001.exr")
    assert os.path.exists(blender_depth), f"File not found: {blender_depth}"
    depth_image = cv2.imread(blender_depth, cv2.IMREAD_ANYDEPTH)
    depth_image = np.clip(depth_image, 0, 500)
    depth_map = depth_image
    scene_path = os.path.join(input_path, f"{frame_id:05d}.npz")
    scene_info = np.load(scene_path)
    c2w = scene_info['extrinsic']
    fx = fy = scene_info['focal'].item()
    cx, cy = scene_info['W'] / 2, scene_info['H'] / 2
    K = get_intrinsic(fx, fy, cx, cy)
    pcd_map, mask = get_pcd_from_depth(depth_map, os.path.join(blender_depth_path, "vehicle_and_plane0001.npy"), K, c2w, save_pcd=True)

def get_intrinsic(focal_u, focal_v, optical_center_x, optical_center_y):
    K = np.array([
        focal_u, 0, optical_center_x, 0, focal_v,
        optical_center_y, 0, 0, 1
    ]).reshape(-1, 3)
    return K

def get_pcd_from_depth(depth, out_path, K, c2w=None, save_pcd=False):
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    uv_points = np.vstack((u.flatten(), v.flatten(), np.ones_like(u.flatten())))
    normalized_plane_coordinates = np.linalg.inv(K) @ uv_points
    
    X_norm = normalized_plane_coordinates[0, :]
    Y_norm = normalized_plane_coordinates[1, :]
    Z_norm = depth.flatten() 
    
    X = Z_norm * X_norm
    Y = Z_norm * Y_norm
    Z = depth.flatten()
    
    pcd_map_cam = np.stack((X, Y, Z), axis=-1)
    if c2w is not None:
        num_point = len(pcd_map_cam)
        pcd_map_cam = np.hstack((pcd_map_cam, np.ones((num_point, 1))))
        pcd_map_w = np.matmul(c2w[:3], pcd_map_cam.T).T
        pcd_map = pcd_map_w.reshape(H, W, 3)
        mask = (pcd_map[..., 0] <= 1000) & (pcd_map[..., 0] > 0)
    else:
        pcd_map = pcd_map_cam.reshape(H, W, 3)
        mask = (pcd_map[..., 2] <= 1000) & (pcd_map[..., 2] > 0)
    pcd_map[~mask] = np.array([0,0,0],dtype=np.float64)

    if save_pcd:
        points = pcd_map[mask]
        df = pd.DataFrame(points, columns=["x", "y", "z"])
        cloud = PyntCloud(df)
        cloud.to_file(out_path.replace('.npy', '_w2.ply'))
    return pcd_map, mask

Meanwhile, I found that the z value of the ground plane in Blender is also not 0, as shown in the figure. I suspect some deviation (e.g. in the camera intrinsics) in Blender's depth rendering. Could you provide some help? Thank you very much.

Apr 20 '24 11:04 szhang963

The output depth may be (x^2 + y^2 + z^2)^0.5.

In other words, the depth measures the Euclidean distance between each point and the camera origin, not the z value along the optical axis, and that would cause the mismatch.

I will check it further.

Apr 22 '24 03:04 yifanlu0227

I think my guess is correct. We enable the 'z-pass' here, which creates a Depth output socket on the Render Layers node. The saved depth.exr stores the absolute distance from the camera to the object's surface.

https://docs.blender.org/manual/en/3.5/render/layers/passes.html#cycles

https://blender.stackexchange.com/questions/52328/render-depth-maps-with-world-space-z-distance-with-respect-the-camera

Apr 22 '24 03:04 yifanlu0227

It seems that we can't recover real world points from this depth using the camera intrinsics and extrinsics. Could we add a scale value and align with points in the real world to refine the depth?

Apr 22 '24 06:04 szhang963

As another solution, can we use the camera projection to get the depth directly, or convert the absolute distance to the depth along the optical axis (z-depth)?

Apr 22 '24 07:04 szhang963

Of course. You know each camera ray's direction vector, so you can easily obtain the z-axis value.
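
A minimal sketch of that conversion, assuming a pinhole camera with intrinsic matrix K and a depth map that stores the distance along each pixel's ray, as discussed above. The ray through pixel (u, v) is K^-1 [u, v, 1]^T, whose z component is 1, so dividing the stored distance by the norm of that vector gives the z-depth. `ray_distance_to_z_depth` is a hypothetical helper, not a ChatSim function, and it uses integer pixel coordinates to match get_pcd_from_depth.

```python
import numpy as np

def ray_distance_to_z_depth(dist, K):
    """Convert a map of distances along camera rays into z-depth along the optical axis."""
    H, W = dist.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack((u, v, np.ones_like(u)), axis=-1).astype(np.float64)  # (H, W, 3)
    rays = pix @ np.linalg.inv(K).T        # per-pixel ray directions with z component 1
    return dist / np.linalg.norm(rays, axis=-1)

# Synthetic check: a plane at z = 5 in front of a small 8x6 camera
K = np.array([[100.0,   0.0, 4.0],
              [  0.0, 100.0, 3.0],
              [  0.0,   0.0, 1.0]])
u, v = np.meshgrid(np.arange(8), np.arange(6))
rays = np.stack(((u - 4.0) / 100.0, (v - 3.0) / 100.0, np.ones((6, 8))), axis=-1)
dist = 5.0 * np.linalg.norm(rays, axis=-1)   # what a ray-distance depth pass would store
z = ray_distance_to_z_depth(dist, K)
print(np.allclose(z, 5.0))
```

The converted map can then be passed to get_pcd_from_depth in place of the raw EXR values, since that function multiplies the normalized coordinates by z-depth.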

Apr 22 '24 07:04 yifanlu0227