cuda-python
Add peer access support
In a multi-GPU setting, it is currently not possible to memcpy directly between two devices. To enable this, cuda.core would have to make a couple of driver calls, one to check for peer capability and one to turn it on:
- cuDeviceCanAccessPeer
- cuCtxEnablePeerAccess
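
For illustration, the two calls above could be combined roughly as follows with the low-level cuda-python driver bindings. This is only a sketch, not a proposal for the cuda.core API: the helper name `enable_peer_access` is hypothetical, and the bindings module is passed in as `driver` so that no particular import path is assumed.

```python
def enable_peer_access(driver, dev, peer_dev, peer_ctx):
    """Let the current context access memory owned by peer_ctx (hypothetical helper).

    driver   -- the cuda-python driver bindings module
    dev      -- CUdevice of the current context
    peer_dev -- CUdevice that owns peer_ctx
    peer_ctx -- CUcontext whose allocations should become accessible

    Returns True if peer access is enabled, False if the hardware or
    topology does not allow it.
    """
    ok = driver.CUresult.CUDA_SUCCESS
    already = driver.CUresult.CUDA_ERROR_PEER_ACCESS_ALREADY_ENABLED

    # Step 1: ask whether dev can directly address memory on peer_dev.
    err, can_access = driver.cuDeviceCanAccessPeer(dev, peer_dev)
    if err != ok:
        raise RuntimeError(f"cuDeviceCanAccessPeer failed: {err}")
    if not can_access:
        return False

    # Step 2: enable access from the current context; Flags must be 0.
    (err,) = driver.cuCtxEnablePeerAccess(peer_ctx, 0)
    if err not in (ok, already):
        raise RuntimeError(f"cuCtxEnablePeerAccess failed: {err}")
    return True
```

Peer access enabled this way is one-directional, so a bidirectional copy path would need the same call made from the other context as well. The `driver` argument is expected to be something like `from cuda.bindings import driver` (or `from cuda import cuda` in older releases).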
This issue is actually very tricky because the modern way of exposing peer access is tied to the mempool/VMM APIs (cuMemPoolSetAccess for memory pools, cuMemSetAccess for VMM mappings), just as IPC is (#103). We'll need a unified, consistent solution here.
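
To make the contrast with the context-based calls concrete, here is a rough sketch of the mempool route using the cuda-python driver bindings. It assumes the device's default memory pool is the one being shared; the helper name `grant_pool_access` is hypothetical, and `driver` is again the bindings module passed in by the caller.

```python
def grant_pool_access(driver, owner_dev, peer_dev_id):
    """Give device peer_dev_id read/write access to owner_dev's default mempool.

    Hypothetical helper; driver is the cuda-python driver bindings module,
    owner_dev is a CUdevice, peer_dev_id is the peer's device ordinal.
    Returns the pool handle whose access was updated.
    """
    ok = driver.CUresult.CUDA_SUCCESS

    # Assumption: the default pool is the one whose allocations are shared.
    err, pool = driver.cuDeviceGetDefaultMemPool(owner_dev)
    if err != ok:
        raise RuntimeError(f"cuDeviceGetDefaultMemPool failed: {err}")

    # Describe who gets access (the peer device) and what kind (read/write).
    desc = driver.CUmemAccessDesc()
    desc.location.type = driver.CUmemLocationType.CU_MEM_LOCATION_TYPE_DEVICE
    desc.location.id = peer_dev_id
    desc.flags = driver.CUmemAccess_flags.CU_MEM_ACCESS_FLAGS_PROT_READWRITE

    # Access is a property of the pool, not of the current context.
    (err,) = driver.cuMemPoolSetAccess(pool, [desc], 1)
    if err != ok:
        raise RuntimeError(f"cuMemPoolSetAccess failed: {err}")
    return pool
```

Here access is granted per pool and per device rather than per context, which is part of why the two models are hard to present behind one interface.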