Choose which GPU to use on a multi-GPU system
When you import gpu4pyscf, it creates CUDA streams on all visible CUDA devices.
https://github.com/pyscf/gpu4pyscf/blob/f805329f11dcda25c27e1eedbb1bbb1890e45327/gpu4pyscf/config.py#L20-L23
There are cases where this is undesirable, for example when multiple copies of a program are launched with MPI on one node and each copy is meant to access only one device.
How can we make this configurable? Using CUDA_VISIBLE_DEVICES is not an option for me, because the same process that uses gpu4pyscf also uses other GPUs for other purposes.
@tvogels Thank you for raising the issue. The multi-GPU feature was designed to use all available GPUs. A configurable device list, or allowing the user to turn off multi-GPU, would indeed add flexibility in controlling which GPUs are used. I will label this as a feature request for now.
Alternatively, you can launch a subprocess in Python and use CUDA_VISIBLE_DEVICES to control which devices the gpu4pyscf task sees. That probably does not help if there are other constraints, though.
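The subprocess workaround could be sketched like this. The child command here is just a stand-in that reports the environment variable; in a real workflow it would be the script that imports gpu4pyscf and runs the calculation:

```python
import os
import subprocess
import sys

# Copy the parent environment and restrict the child to a single device.
# The parent process itself is unaffected and keeps seeing all GPUs.
env = dict(os.environ)
env["CUDA_VISIBLE_DEVICES"] = "1"

# Stand-in child: a real run would execute the gpu4pyscf driver script here.
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=env, capture_output=True, text=True, check=True,
)
print(result.stdout.strip())  # prints "1"
```

Because gpu4pyscf enumerates devices at import time, the restriction takes effect only if the variable is set before the child interpreter starts, which is exactly what passing `env=` guarantees.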
Hi @wxj6000, thanks! Yeah that makes sense.
I’d be happy to contribute this feature, but I don’t see an easy way to do this while maintaining compatibility with the current behavior. Let me know if you have a good idea.
Here is the simplest solution I can come up with. We can introduce a list called active_device_ids, holding the IDs of the active devices.
Then, replace all the
for device_id in range(num_devices):
with
for device_id in active_device_ids:
The default is active_device_ids = list(range(num_devices)) (a list rather than a range object, so it can be mutated). If needed, people can modify the list in place.
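A minimal sketch of the proposal (the module layout, num_devices value, and init_streams name are placeholders for illustration, not gpu4pyscf's actual code):

```python
# Sketch of the proposed configuration mechanism, not gpu4pyscf itself.
num_devices = 4  # in gpu4pyscf this would come from the CUDA runtime

# Default: all devices are active. A list (not a bare range) so callers
# can mutate it in place after import.
active_device_ids = list(range(num_devices))

def init_streams():
    """Stand-in for the per-device loop that creates CUDA streams."""
    initialized = []
    for device_id in active_device_ids:  # was: range(num_devices)
        initialized.append(device_id)
    return initialized

print(init_streams())  # [0, 1, 2, 3]

# Restrict to one device before the loop runs again:
active_device_ids[:] = [2]
print(init_streams())  # [2]
```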
Hmm, but this code runs at import time. How would you set the list of active devices so early?
Yes, the list of active devices is created in memory at import time, but its contents can be updated later through a reference to active_device_ids.
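One caveat worth noting: the update must mutate the list in place; rebinding the name in user code would not affect the list the library holds. A minimal illustration in plain Python (no gpu4pyscf involved, library_view stands in for the library's internal reference):

```python
# The library keeps a reference to the list created at import time.
library_view = active_device_ids = list(range(4))

# Wrong: rebinding creates a brand-new list; the library's reference
# still points at the old one, so it keeps seeing all devices.
active_device_ids = [0]
print(library_view)  # [0, 1, 2, 3]

# Right: slice assignment mutates the original object in place,
# so the library sees the change through its own reference.
active_device_ids = library_view
active_device_ids[:] = [0]
print(library_view)  # [0]
```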