cuda-python
CUDA Python: Performance meets Productivity
After https://github.com/NVIDIA/cuda-python/pull/1271, we have to xfail a number of cufile tests in CI that run on an ext4 filesystem even though that is nominally supported. It would be good to...
This PR should not be merged. This is a notebook that explores the P95 and P99 values of a few review metrics. The purpose of putting it in a PR...
Follow-up to https://github.com/NVIDIA/cuda-python/pull/1216. We currently test the NVRTC path, but `cuda.core.Program` also covers libNVVM (and nvJitLink!), and we should get them tested too.
Support for locating these needs to be added:

```
rwgk-win11.localdomain:/usr/local/cuda-13.0 $ find . -name libdevice.10.bc -o -name nvdisasm -o -name cuobjdump -o -name libcudadevrt.a
./nvvm/libdevice/libdevice.10.bc
./targets/x86_64-linux/lib/libcudadevrt.a
./bin/cuobjdump
./bin/nvdisasm
```

...
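As a rough Python analogue of the `find` command above, the following sketch searches a CUDA installation root for the same four file names. The helper name `find_artifacts` is hypothetical and is not part of `cuda.pathfinder`; it only illustrates the kind of lookup being requested.

```python
from pathlib import Path

# File names taken from the `find` command in the issue text.
ARTIFACT_NAMES = ("libdevice.10.bc", "nvdisasm", "cuobjdump", "libcudadevrt.a")


def find_artifacts(cuda_root):
    """Map each artifact name to the sorted list of matches under cuda_root.

    Hypothetical helper for illustration; not part of cuda.pathfinder.
    """
    root = Path(cuda_root)
    found = {}
    for name in ARTIFACT_NAMES:
        # rglob walks the tree recursively, like `find . -name <name>`.
        found[name] = sorted(p for p in root.rglob(name) if p.is_file())
    return found
```

Calling `find_artifacts("/usr/local/cuda-13.0")` on the machine from the excerpt would be expected to report the same four paths; on a machine without that directory every list is empty.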
xref: https://github.com/NVIDIA/cuda-python/pull/1284#discussion_r2558243942

These libs are installed as part of the CUDA driver and can be reliably found via the dynamic loader system search.

* `"cuda"`
  * Linux: `libcuda.so.1`
  * Windows:...
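To illustrate what "found via the dynamic loader system search" means in practice, here is a minimal sketch using the standard-library `ctypes` module. The helper name `load_via_system_search` is hypothetical and is not the `cuda.pathfinder` API; only the Linux name `libcuda.so.1` from the list above is used.

```python
import ctypes


def load_via_system_search(soname):
    """Try to load a shared library by bare name; return None if not found.

    Passing a bare name (no directory) leaves the lookup to the dynamic
    loader's normal system search. Hypothetical helper, not the
    cuda.pathfinder API.
    """
    try:
        return ctypes.CDLL(soname)
    except OSError:
        return None


# `libcuda.so.1` is the Linux name listed above; the result is None on
# machines where the CUDA driver is not installed.
libcuda = load_via_system_search("libcuda.so.1")
```

Returning `None` instead of raising keeps the sketch usable on machines without a driver; a real implementation might prefer a descriptive exception.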
- many cuda.core operations do not actually need an active CUDA context (aka a device that is set to current)
- some just need CUDA to be initialized, meaning `cuInit(0)`...
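The distinction between "initialized" and "has a current context" can be seen directly at the driver level. The sketch below calls the driver API through `ctypes` and queries the device count after `cuInit(0)` alone, without creating or setting any context. The function name is hypothetical, the library name assumes Linux, and this is not how `cuda.core` itself is implemented.

```python
import ctypes

CUDA_SUCCESS = 0  # value of CUDA_SUCCESS in the driver API's CUresult enum


def device_count_without_context(soname="libcuda.so.1"):
    """Return the number of CUDA devices, or None if the driver is unusable.

    Hypothetical helper for illustration. No context is created or made
    current at any point.
    """
    try:
        libcuda = ctypes.CDLL(soname)
    except OSError:
        return None  # driver library not installed
    # cuInit(0) initializes the driver API; it does not create a context.
    if libcuda.cuInit(ctypes.c_uint(0)) != CUDA_SUCCESS:
        return None
    count = ctypes.c_int(0)
    # cuDeviceGetCount only requires an initialized driver.
    if libcuda.cuDeviceGetCount(ctypes.byref(count)) != CUDA_SUCCESS:
        return None
    return count.value
```

On a machine without the driver or without a usable GPU the function returns `None` rather than raising.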
- cuda.core does not allow any side calls to `cudaDeviceReset` or the like that tear down the primary contexts
- avoid multiple frees of the same buffer in the child process...
Add pytest-randomly to cuda_pathfinder. Tests are randomized by default. [`pytest-randomly` docs](https://github.com/pytest-dev/pytest-randomly/blob/main/README.rst)
We have examples like this today

```python
kernel = module.get_kernel("vectorAdd")
```

which does not tell us what args are expected on the device side, and so when `launch(s, config, kernel,...
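The excerpt is cut off, so the proposed fix is not known. As one possible direction only, the sketch below shows how a declared argument list could be checked on the host before a launch. `KernelSignature` and the four-argument layout for `vectorAdd` are assumptions made for illustration; neither is an existing `cuda.core` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelSignature:
    """Hypothetical record of the argument types a kernel expects."""

    name: str
    arg_types: tuple

    def check(self, *args):
        """Raise TypeError if args do not match the declared types."""
        if len(args) != len(self.arg_types):
            raise TypeError(
                f"{self.name} expects {len(self.arg_types)} args, got {len(args)}"
            )
        for i, (arg, expected) in enumerate(zip(args, self.arg_types)):
            if not isinstance(arg, expected):
                raise TypeError(
                    f"{self.name} arg {i}: expected {expected.__name__}, "
                    f"got {type(arg).__name__}"
                )


# Assumed signature for illustration only: three device pointers passed as
# integers, followed by an element count.
vector_add_sig = KernelSignature("vectorAdd", (int, int, int, int))
vector_add_sig.check(0x1000, 0x2000, 0x3000, 1024)  # passes silently
```

A mismatch in count or type then surfaces as a Python `TypeError` at the call site rather than as undefined behavior on the device.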