Depth image renderer
Description
Thanks a lot to @Divelix for suggesting and implementing this feature!
This PR should only be reviewed and merged after PR #88.
Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that causes existing functionality to not work as expected)
- [ ] Other (please describe):
Detailed Summary
TODO: Provide the motivation, context, and links to any related issues, PRs, or documentation:
- Motivation: Why is this change necessary?
- Context: How does it fit into wavemap's functionality?
- Related issues/PRs: Fixes # (issue) / Links to other PRs
API Changes
TODO: List any changes to wavemap's APIs to help users update their code. Write "None" if there are no changes.
C++ API:
Python API:
ROS1 Interface:
Review Notes
TODO: Is there anything specific the reviewers should focus on, or are there unresolved questions? Mention them here.
Testing
Automated Tests
TODO: Have you added or modified unit tests to verify these changes? If not, let us know if you'd like assistance.
Manual Tests
TODO: If manual tests were performed to verify these changes, describe them here and include instructions to reproduce them. Describe test configurations where applicable.
System information (optional):
- CPU: [e.g., Intel i9-9900K]
- GPU: [e.g., Nvidia RTX 2080Ti]
- RAM: [e.g., 32GB]
- OS: [e.g., Ubuntu 20.04]
- API: [e.g., C++, Python, ROS1]
- Installation: [e.g., pre-built Docker, local CMake, Pip, catkin]
Runtime information (optional):
- Launch file: [e.g., Link or GitHub Gist]
- Config file: [e.g., Link or GitHub Gist]
- Dataset name (if public): [e.g., Newer College Cloister]
- Custom setup (for private datasets, or live usage):
- Depth sensor: [e.g., Livox MID360 LiDAR]
- Pose source: [e.g., Odometry from FastLIO2]
For performance or accuracy-related changes, include the above system and runtime information and describe:
Performance (optional)
- Measured operation: [e.g. serializing the map, performing 1M queries, processing dataset X]
- Metrics: [e.g., CPU time, wall time, total RAM usage]
Accuracy (optional)
- Metrics: [e.g., AUC, accuracy, recall]
Summary of changes
- What metrics improved and by how much?
- Did any metrics worsen?
Benchmarks (To be completed by maintainers)
TODO: We will rerun wavemap's benchmarks and report the results here to validate there are no general performance/accuracy regressions.
Checklist
General
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated tests as required
- [ ] Any required changes in dependencies have been committed and pushed
Documentation (where applicable)
- [ ] I have updated the installation instructions (in docs/pages/installation)
- [ ] I have updated the code's inline API documentation (e.g., docstrings)
- [ ] I have updated the parameter documentation (in docs/pages/parameters)
- [ ] I have updated/extended the tutorials (in docs/pages/tutorials)
Hi Victor,
I ran into some trouble using the raycasting renderer. Here is a minimal script showing what I tried to do:
import wavemap as wm
import numpy as np
W, H = 1920, 1280
# rot_so3, trans, K = ...
your_map = wm.Map.load("path/to/your_map.wvmp")
renderer = wm.RaycastingRenderer(
    your_map,
    wm.Projector.create(
        {
            "type": "pinhole_camera_projector",
            "width": W,
            "height": H,
            "fx": K[0, 0],
            "fy": K[1, 1],
            "cx": K[0, 2],
            "cy": K[1, 2],
        }
    ),
    0.1,
    150,
    -1,
)
pose = wm.Pose(wm.Rotation(rot_so3), trans)
undistorted_depth_image = renderer.render(pose).data
import open3d as o3d # version 0.19.0
depth = o3d.t.geometry.Image(o3d.core.Tensor(undistorted_depth_image[::-1, ::], dtype=o3d.core.Dtype.Float32))
intrinsic = o3d.camera.PinholeCameraIntrinsic()
intrinsic.set_intrinsics(
    width=W,
    height=H,
    fx=K[0, 0],
    fy=K[1, 1],
    cx=K[0, 2],
    cy=K[1, 2],
)
intrinsic_matrix = o3d.core.Tensor(intrinsic.intrinsic_matrix, o3d.core.Dtype.Float64)
extrinsic = np.eye(4)
extrinsic[:3, :3] = rot_so3
extrinsic[:3, 3] = trans
pcd = o3d.t.geometry.PointCloud.create_from_depth_image(
    depth,
    intrinsic_matrix,
    o3d.core.Tensor(extrinsic, dtype=o3d.core.Dtype.Float64),
    depth_scale=1,
    depth_max=150,
).to_legacy()
# Visualize the point cloud
o3d.visualization.draw_geometries([pcd])
I used the following wavemap map:
Target image used to create the depth map (sets K, H, W, rot, trans):
The projected points on the image line up perfectly.
However, the metric depth looked slightly off from this viewpoint.
Camera View:
I found that the depth is slightly distorted: specifically, on the right of the scene the ground should be perfectly flat, but it bends noticeably.
Therefore, I rendered two additional viewpoints (one exactly aligned with the camera, and one 2 m above the camera facing down at the ground).
3D-Person View - Ground
Sideview - Ground
Here we can see a clear distortion of the returned depth values. Any idea how this can happen?
Some further checks that I did:
- I tried using MATLAB to project the point cloud (same result, so it is not an Open3D error)
- Moving the camera further away from the ground makes the distortion worse
Hi @JonasFrey96,
Thanks for reporting this issue and providing a detailed description!
The bug occurs because we always return a range image, but for depth cameras we need a depth image. The code to generate both is almost the same, but for depth images we should return the endpoint's Z-coordinate in the sensor frame instead of the Euclidean distance along the ray. At the image center these values are identical, but they diverge toward the edges, roughly like a parabola.
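To illustrate the difference, here is a minimal numpy sketch (with made-up pinhole intrinsics, not the ones from your script) showing how range and depth diverge away from the image center:

```python
import numpy as np

# Hypothetical intrinsics, for illustration only.
W, H = 640, 480
fx, fy = 500.0, 500.0
cx, cy = W / 2.0, H / 2.0

# Per-pixel ray directions in the camera frame, on the z = 1 plane.
u, v = np.meshgrid(np.arange(W), np.arange(H))
rays = np.stack([(u - cx) / fx, (v - cy) / fy, np.ones((H, W))], axis=-1)
ray_norms = np.linalg.norm(rays, axis=-1)

# A range image stores the Euclidean distance along each ray;
# a depth image stores the z-coordinate of the ray's endpoint.
range_image = np.full((H, W), 10.0)    # e.g., every ray travels 10 m to a hit
depth_image = range_image / ray_norms  # z = range / ||(x/z, y/z, 1)||

# At the principal point the two coincide; toward the edges depth < range,
# which is exactly the bowl-shaped distortion visible on the flat ground.
```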
Could you please test changing lines 41–46 in raycasting_renderer.cc to directly use the lower-level raycast::first_collision_index function, computing the depth in the sensor frame as follows?
const auto colliding_index = first_collision_index<MapT>(
    map, W_start_point, W_end_point, log_odds_occupancy_threshold);
if (colliding_index) {
  const FloatingPoint min_cell_width = map.getMinCellWidth();
  const Point3D W_voxel_center =
      convert::indexToCenterPoint(colliding_index.value(), min_cell_width);
  const Point3D C_voxel_center = T_W_C.inverse() * W_voxel_center;
  depth_pixel = projection_model_->cartesianToSensorZ(C_voxel_center);
}
Hi Victor,
Big thanks, this fixed everything!
Here is the correct code snippet that runs:
const auto colliding_index = raycast::first_collision_index<const MapT>(
    map, W_start_point, W_end_point, log_odds_occupancy_threshold_);
if (colliding_index) {
  const FloatingPoint min_cell_width = map.getMinCellWidth();
  const Point3D W_voxel_center =
      convert::indexToCenterPoint(colliding_index.value(), min_cell_width);
  const Point3D C_voxel_center = T_W_C.inverse() * W_voxel_center;
  depth_pixel = projection_model_->cartesianToSensorZ(C_voxel_center);
}