WIP: Introducing `cuda.py`: pythonic access to CUDA core functionalities
xref: #70
This is a highly experimental feature currently for preview purposes only, not for production use.
The current focus is on a correct, robust, and future-proof design for the pythonic CUDA APIs, with the aim of boosting Python CUDA developers' productivity, among other things. This PR can be considered a reference/prototype implementation for correctness checks and a playground; performance improvements will come next, once the design has converged and been approved.
More to come in the near future!
TODOs before merging (into the cuda_py branch just forked from main):
- [ ] Expand/update PR description
- [ ] Add README
- [ ] Sync up with the (internal) design doc
- [ ] Polish & check in test files
- [ ] Polish & check in example files
- [x] Update DLpack support
- [ ] Rerun tests with CUDA 11 driver & CTK
- [ ] ~~Fix (?) dependencies on the Python bindings~~
- [x] Add `GPUMemoryView` prototype
- [ ] ~~Add documentation~~
A separate issue will be created to track needed changes after merging (but before a beta release).
This pull request requires additional validation before any workflows can run on NVIDIA's runners.
Update: To unblock parallel development, I will merge this PR soon (~tomorrow), after checking in the following changes, so that subsequent PRs can start targeting the main branch of this repo:
- ~~`GPUMemoryView` -> `StridedMemoryView`~~ (done in commit a41a4b7)
- Add sample code for `StridedMemoryView`
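
For readers unfamiliar with the idea, here is a rough, host-only sketch of the kind of metadata a strided memory view usually carries (shape, strides, item size, read-only flag). It is not the `StridedMemoryView` API from this PR: the class name `StridedViewSketch` and its `from_buffer` helper are hypothetical, and the sketch relies only on the standard-library `memoryview` and buffer protocol, with no GPU involved.

```python
# Illustrative only: NOT the cuda.py API. StridedViewSketch is a hypothetical
# name; it just records what the Python buffer protocol already exposes.
import struct
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StridedViewSketch:
    shape: Tuple[int, ...]    # number of elements along each dimension
    strides: Tuple[int, ...]  # step in BYTES along each dimension
    itemsize: int             # size of one element in bytes
    readonly: bool            # whether the underlying buffer is writable

    @classmethod
    def from_buffer(cls, obj) -> "StridedViewSketch":
        # memoryview() accepts any object supporting the buffer protocol.
        mv = memoryview(obj)
        return cls(tuple(mv.shape), tuple(mv.strides), mv.itemsize, mv.readonly)


# Build a 2x3 row-major buffer of C ints in host memory.
itemsize = struct.calcsize("i")
buf = memoryview(bytearray(6 * itemsize)).cast("i", shape=(2, 3))

view = StridedViewSketch.from_buffer(buf)
print(view)
```

For a row-major 2x3 layout, the strides come out as `(3 * itemsize, itemsize)`: stepping one row skips three elements, while stepping one column skips one. A device-side view would presumably also need a device identifier and a raw pointer, which this host-only sketch leaves out.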
Merging! cc @ksimpson-work