[Question] Why does camera.take_picture() take too long?
Hi, I am using SAPIEN to collect robot data, and I have run into an issue where camera.take_picture() takes around 0.136 seconds (0.135921 s measured) on every call, which significantly slows down my program. I set up the SAPIEN simulation environment as follows:
def setup_scene(self,
                timestep: float = 1 / 150,
                ground_height: float = 0.,
                static_friction: float = 0.5, dynamic_friction: float = 0.5, restitution: float = 0.,
                ambient_light: list[float] = [0.5, 0.5, 0.5],
                shadow: bool = True,
                direction_lights: list[list[float]] = [[[0, 0.5, -1], [0.5, 0.5, 0.5]]],
                point_lights: list = [[[1, 0, 1.8], [1, 1, 1]], [[-1, 0, 1.8], [1, 1, 1]]],
                camera_xyz: list[float] = [0.4, 0.22, 1.5], camera_rpy: list[float] = [0, -0.8, 2.45],):
    '''
    Set the scene
    - Set up the basic scene: light source, viewer.
    '''
    self.engine = sapien.Engine()
    # declare sapien renderer
    from sapien.render import set_global_config
    set_global_config(max_num_materials = 50000, max_num_textures = 50000)
    self.renderer = sapien.SapienRenderer()
    # give renderer to sapien sim
    self.engine.set_renderer(self.renderer)
    sapien.render.set_camera_shader_dir("rt")
    sapien.render.set_ray_tracing_samples_per_pixel(32)
    sapien.render.set_ray_tracing_path_depth(8)
    sapien.render.set_ray_tracing_denoiser("oidn")
    # declare sapien scene
    scene_config = sapien.SceneConfig()
    self.scene = self.engine.create_scene(scene_config)
    # set simulation timestep
    self.scene.set_timestep(timestep)
    # add ground to scene
    self.scene.add_ground(ground_height)
    # set default physical material
    self.scene.default_physical_material = self.scene.create_physical_material(
        static_friction,
        dynamic_friction,
        restitution,
    )
    # give some white ambient light of moderate intensity
    self.scene.set_ambient_light(ambient_light)
    # default spotlight angle and intensity
    for direction_light in direction_lights:
        self.scene.add_directional_light(
            direction_light[0], direction_light[1], shadow=shadow
        )
    # default point lights position and intensity
    for point_light in point_lights:
        self.scene.add_point_light(point_light[0], point_light[1], shadow=shadow)
    # initialize viewer with camera position and orientation
    if self._render:
        self.viewer = Viewer(self.renderer)
        self.viewer.set_scene(self.scene)
        self.viewer.set_camera_xyz(
            x=camera_xyz[0],
            y=camera_xyz[1],
            z=camera_xyz[2],
        )
        self.viewer.set_camera_rpy(
            r=camera_rpy[0],
            p=camera_rpy[1],
            y=camera_rpy[2],
        )
I have defined multiple cameras using the following code:
def load_camera(self):
    '''
    Add cameras and set camera parameters
    '''
    self.sensor_cameras = dict()
    camera_top = self.scene.add_camera(
        name="camera_top",
        width=SENSOR_CAMERA_WIDTH, height=SENSOR_CAMERA_HEIGHT,
        fovy=np.deg2rad(SENSOR_CAMERA_FOVY),
        near=SENSOR_CAMERA_NEAR, far=SENSOR_CAMERA_FAR)
    camera_top.entity.set_pose(rand_pose(position_reference=[0, 0, 1.5],
                                         x_limit=[-0.01, 0.01], y_limit=[-0.01, 0.01], z_limit=[-0.01, 0.01],
                                         rotation_reference=[0., np.pi / 2, np.pi / 2],
                                         euler_angles_limit=[np.pi / 36, np.pi / 36, np.pi / 36]))
    self.sensor_cameras["camera_top"] = camera_top
    camera_left = self.scene.add_camera(
        name="camera_left",
        width=SENSOR_CAMERA_WIDTH, height=SENSOR_CAMERA_HEIGHT,
        fovy=np.deg2rad(SENSOR_CAMERA_FOVY),
        near=SENSOR_CAMERA_NEAR, far=SENSOR_CAMERA_FAR)
    camera_left.entity.set_pose(rand_pose(position_reference=[TABLE_LENGTH / 2, 0, TABLE_HEIGHT + 0.075],
                                          x_limit=[-0.01, 0.01], y_limit=[-0.01, 0.01], z_limit=[-0.01, 0.01],
                                          rotation_reference=[0., 0., np.pi],
                                          euler_angles_limit=[np.pi / 36, np.pi / 36, np.pi / 36]))
    self.sensor_cameras["camera_left"] = camera_left
    camera_wrist = self.scene.add_mounted_camera(
        name="camera_wrist",
        mount=self.robot_end_effector_link.entity,
        pose=rand_pose(position_reference=[0.05, 0, 0.025,],
                       x_limit=[-0.01, 0.01], y_limit=[-0.01, 0.01], z_limit=[-0.01, 0.01],
                       rotation_reference=[0., -np.pi/2, np.pi],
                       euler_angles_limit=[np.pi / 36, np.pi / 36, np.pi / 36]),
        width=SENSOR_CAMERA_WIDTH, height=SENSOR_CAMERA_HEIGHT,
        fovy=np.deg2rad(SENSOR_CAMERA_FOVY),
        near=SENSOR_CAMERA_NEAR, far=SENSOR_CAMERA_FAR
    )
    self.sensor_cameras["camera_wrist"] = camera_wrist
What could be causing this delay, and is there any way to speed it up? For example, is it possible to configure the camera to capture only certain data? My code does not need object surface normals or segmentation results.
From your code, it looks like you are using ray tracing with 32 samples per pixel, which can be quite slow depending on your hardware. For example, at 1K resolution it takes more than 1 second on my low-end integrated graphics card, so 0.13 seconds is not surprising. If you need it to run faster, you can switch to rasterization, reduce the number of samples per pixel, or reduce the image resolution.
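To make the three options concrete, here is a rough sketch, not a definitive recipe. It assumes that ray-tracing cost grows, to a first approximation, with width × height × samples per pixel, and that "default" is the name of the rasterization shader pack. The 640x480 resolution and the helper names samples_per_frame and configure_camera_rendering are hypothetical; the sapien.render calls are the ones already used in the question.

```python
def samples_per_frame(width: int, height: int, spp: int) -> int:
    """Total ray-traced samples for one image: `spp` samples for every pixel."""
    return width * height * spp


def configure_camera_rendering(ray_tracing: bool, spp: int = 4) -> None:
    """Apply global camera render settings (assumed to be set before cameras are created)."""
    import sapien  # imported lazily so the cost estimate below runs without SAPIEN

    if ray_tracing:
        sapien.render.set_camera_shader_dir("rt")
        sapien.render.set_ray_tracing_samples_per_pixel(spp)
        sapien.render.set_ray_tracing_denoiser("oidn")
    else:
        # Assumption: "default" selects the rasterization shaders.
        sapien.render.set_camera_shader_dir("default")


# Hypothetical 640x480 camera; substitute your own SENSOR_CAMERA_WIDTH / HEIGHT.
baseline = samples_per_frame(640, 480, 32)
fewer_spp = samples_per_frame(640, 480, 4)
half_res = samples_per_frame(320, 240, 32)
print(baseline // fewer_spp, baseline // half_res)  # prints "8 4"
```

Under that assumption, dropping from 32 to 4 samples cuts the per-frame work by about 8x, and halving both image dimensions cuts it by about 4x. Actual timings also depend on path depth, the denoiser, and fixed per-frame overhead, so measure before and after.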
However, if you are using a high-end GPU such as an RTX 4090, this may indicate that the GPU is not being picked up properly and the renderer is running on integrated graphics or, in some cases, even on the CPU. You can check this by calling sapien.render.set_log_level("info") and looking for the selected graphics device in the log output when the renderer is created.
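As a complementary check outside SAPIEN, you can ask the NVIDIA driver which GPUs it sees. This is only a sketch under the assumption that you are on an NVIDIA card with nvidia-smi installed; list_nvidia_gpus is a hypothetical helper name. It does not tell you which device SAPIEN actually selected, so the renderer log remains the authoritative answer.

```python
import shutil
import subprocess


def list_nvidia_gpus() -> list[str]:
    """Return GPU names reported by nvidia-smi, or [] if it is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # One GPU name per non-empty output line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
print(gpus if gpus else "no NVIDIA GPU visible to the driver")
```

If this list is empty on a machine that should have a discrete GPU, the problem is most likely at the driver level rather than in SAPIEN.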
Capturing less data will probably not speed up rendering at all, especially for ray tracing. Even for rasterization, skipping normals and segmentation only starts to make a difference once I am rendering more than 1 billion pixels per second, after I have optimized everything else to the limit.
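To put the 1 billion pixels per second figure in perspective, here is a quick back-of-the-envelope calculation using the 0.135921 s per call from the question. The 640x480 resolution is an assumption, since the actual SENSOR_CAMERA_WIDTH and SENSOR_CAMERA_HEIGHT are not given, and pixels_per_second is a hypothetical helper.

```python
def pixels_per_second(width: int, height: int, seconds_per_frame: float) -> float:
    """Rendered pixel throughput for a single camera."""
    return width * height / seconds_per_frame


BILLION = 1_000_000_000

# 0.135921 s is the measured take_picture() time; 640x480 is an assumed resolution.
measured = pixels_per_second(640, 480, 0.135921)
print(f"{measured:,.0f} pixels/s, about {BILLION / measured:,.0f}x below 1 billion")
```

At that assumed resolution the throughput is in the low millions of pixels per second, a few hundred times below the point where dropping the normal and segmentation outputs would start to matter.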