CUDA Python: Performance meets Productivity

Results 261 cuda-python issues

After https://github.com/NVIDIA/cuda-python/pull/1271, we have to xfail a number of cufile tests in CI that run on an ext4 filesystem even though that is nominally supported. It would be good to...

bug
triage
test
cuda.bindings

This PR should not be merged. This is a notebook that explores the P95 and P99 values of a few review metrics. The purpose of putting it in a PR...

Follow-up to https://github.com/NVIDIA/cuda-python/pull/1216. We currently test the NVRTC path, but `cuda.core.Program` also covers libNVVM (and nvJitLink!), and we should get those tested too.

triage
test
cuda.core

Support for locating these needs to be added:

```
rwgk-win11.localdomain:/usr/local/cuda-13.0 $ find . -name libdevice.10.bc -o -name nvdisasm -o -name cuobjdump -o -name libcudadevrt.a
./nvvm/libdevice/libdevice.10.bc
./targets/x86_64-linux/lib/libcudadevrt.a
./bin/cuobjdump
./bin/nvdisasm
```

...

cuda.pathfinder
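
To make the request above concrete, here is a minimal sketch of such a lookup: it walks a toolkit root and records where each of the four file names from the `find` output lives. The helper name `locate_artifacts` and its return shape are hypothetical and chosen only for illustration; this is not the `cuda.pathfinder` API.

```python
from pathlib import Path

# File names taken from the `find` command in the issue text.
ARTIFACT_NAMES = ("libdevice.10.bc", "nvdisasm", "cuobjdump", "libcudadevrt.a")


def locate_artifacts(cuda_root):
    """Hypothetical helper: map each artifact name to its relative paths under cuda_root."""
    root = Path(cuda_root)
    found = {name: [] for name in ARTIFACT_NAMES}
    for path in root.rglob("*"):
        if path.is_file() and path.name in found:
            found[path.name].append(path.relative_to(root).as_posix())
    for paths in found.values():
        paths.sort()  # deterministic order, independent of directory traversal
    return found
```

Calling `locate_artifacts("/usr/local/cuda-13.0")` on a layout like the one in the issue would be expected to report `nvvm/libdevice/libdevice.10.bc` for `libdevice.10.bc`. A real implementation would more likely probe a few known relative locations than walk the whole tree.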

xref: https://github.com/NVIDIA/cuda-python/pull/1284#discussion_r2558243942

These libs are installed as part of the CUDA driver and can be reliably found via the dynamic loader system search.

* `"cuda"`
  * Linux: `libcuda.so.1`
  * Windows:...

feature
cuda.pathfinder
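
As an illustration of what "found via the dynamic loader system search" means in the issue above, the sketch below hands a bare library name to `ctypes.CDLL` and lets the loader resolve it. The helper name `load_via_system_search` is hypothetical and is not part of `cuda.pathfinder`. Only the Linux name `libcuda.so.1` appears in the snippet, so that is the only name used here.

```python
import ctypes


def load_via_system_search(soname):
    """Hypothetical helper: ask the dynamic loader to find a library by bare name.

    Returns a ctypes.CDLL handle, or None if the loader cannot find or load it.
    """
    try:
        # A bare name (no directory part) leaves the lookup to the loader's search rules.
        return ctypes.CDLL(soname)
    except OSError:
        return None


# "libcuda.so.1" is the Linux name given in the issue; it loads only if a driver is installed.
driver = load_via_system_search("libcuda.so.1")
print("driver library found" if driver is not None else "driver library not found")
```

Because no directory is given, the result depends entirely on the machine's loader configuration, which is the property the issue relies on for driver-installed libraries.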

- many cuda.core operations do not actually need an active CUDA context (aka a device that is set to current)
- some just need CUDA to be initialized, meaning `cuInit(0)`...

documentation
triage
cuda.core
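
The distinction drawn above between "CUDA is initialized" and "a context is current" can be observed directly at the driver API level. The following is a rough sketch that calls `cuInit` and `cuCtxGetCurrent` through `ctypes`; the helper name `probe_driver_state` is hypothetical, the default library name assumes Linux, and nothing here reflects how `cuda.core` itself is implemented.

```python
import ctypes


def probe_driver_state(soname="libcuda.so.1"):
    """Return (cuInit status, cuCtxGetCurrent status, has_current_context),
    or None if the driver library cannot be loaded."""
    try:
        libcuda = ctypes.CDLL(soname)
    except OSError:
        return None  # no driver library available on this machine

    # CUresult cuInit(unsigned int Flags): initializes the driver API, creates no context.
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    init_status = libcuda.cuInit(0)

    # CUresult cuCtxGetCurrent(CUcontext *pctx): reports the context bound to this thread.
    ctx = ctypes.c_void_p()
    libcuda.cuCtxGetCurrent.argtypes = [ctypes.POINTER(ctypes.c_void_p)]
    libcuda.cuCtxGetCurrent.restype = ctypes.c_int
    ctx_status = libcuda.cuCtxGetCurrent(ctypes.byref(ctx))

    return init_status, ctx_status, ctx.value is not None
```

A status of `0` is `CUDA_SUCCESS`. In a fresh process with a working driver, one would expect both calls to succeed while the last element stays `False`, since initialization alone does not make any context current.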

- cuda.core does not allow any side calls to `cudaDeviceReset` or alike that tear down the primary contexts
- avoid multiple frees of the same buffer in the child process...

documentation
triage
cuda.core

Add pytest-randomly to cuda_pathfinder. Tests are randomized by default. [`pytest-randomly` docs](https://github.com/pytest-dev/pytest-randomly/blob/main/README.rst)

enhancement
triage
test
cuda.pathfinder
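
The value of randomizing test order is that it surfaces tests that silently depend on one another. The sketch below shows the idea in plain Python with two hypothetical checks that share module-level state. It only mimics the concept of seeded shuffling and is not how `pytest-randomly` is implemented.

```python
import random

_cache = {}  # shared module-level state, the usual source of order dependence


def check_populate():
    _cache["value"] = 42
    assert _cache["value"] == 42


def check_read():
    # Hidden dependency: passes only if check_populate already ran.
    assert _cache.get("value") == 42


def run_in_order(checks):
    """Run checks against fresh shared state; return names of the checks that failed."""
    _cache.clear()
    failed = []
    for check in checks:
        try:
            check()
        except AssertionError:
            failed.append(check.__name__)
    return failed


def run_shuffled(checks, seed):
    """Shuffle with a fixed seed so a failing order can be reproduced."""
    order = list(checks)
    random.Random(seed).shuffle(order)
    return [c.__name__ for c in order], run_in_order(order)
```

Running the checks in definition order passes, while the reverse order fails `check_read`. Reusing the same seed reproduces the same order, which is the role the `--randomly-seed` option plays in `pytest-randomly`.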

We have examples like this today

```python
kernel = module.get_kernel("vectorAdd")
```

which does not tell us what args are expected on the device side, and so when `launch(s, config, kernel,...

triage
feature
cuda.core