RayOutOfMemoryError: More than 95% of the memory on node xxx is used
Hello, thank you so much for the reproduction code!
I encountered a RayOutOfMemoryError when running the code. To address it, I tried reducing num_agents from the default of 16 to 8 and then to 4, but the issue remains unsolved. I don't know whether other parameters (such as num_rollout and num_arms) need to be changed along with it.
My machine is a Linux server with 128 GB of RAM and four 2080 Ti GPUs. Could you please show me how to configure the parameters appropriately?
Looking forward to your reply, thanks!
I have also tried setting num_agents, num_rollout, and num_arms to 2, 1, and 4, respectively. The RayOutOfMemoryError still occurs. Why does the memory usage keep rising?
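To help narrow this down, memory growth on the node could be tracked during a run with a small helper like the one below. This is only a sketch and not part of the repository: the names `used_memory_fraction` and `log_memory` are hypothetical, it assumes a Linux machine where `/proc/meminfo` is available, and the fraction it reports may not match exactly how Ray computes its 95% threshold.

```python
def used_memory_fraction(meminfo_text):
    """Return the fraction of RAM in use, given the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token after the colon is the numeric value (kB for sizes).
            fields[key.strip()] = int(parts[0])
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return (total - available) / total


def log_memory(path="/proc/meminfo"):
    """Print and return the current node-wide memory usage (hypothetical helper)."""
    with open(path) as f:
        fraction = used_memory_fraction(f.read())
    print(f"node memory in use: {fraction:.1%}")
    return fraction
```

Calling `log_memory()` periodically (for example, once per training iteration) would show whether usage climbs steadily regardless of num_agents, which would suggest that something is accumulating over time rather than the configuration simply being too large for 128 GB.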