Cédric Augonnet
This PR introduces a new version of the MiniWeather benchmark based on the CUDASTF programming model. CUDASTF is shipped in NVIDIA's CCCL project, and implements task-parallelism as a C++ header...
## Description Experiments on top of the stackable PR (so as not to break code using the branch), do not merge. closes ## Checklist - [ ] New or existing tests cover...
## Description This introduces helper methods to improve how we nest contexts to better leverage CUDA Graphs ## Checklist - [x] New or existing tests cover these changes. - [...
## Description closes ## Checklist - [ ] New or existing tests cover these changes. - [ ] The documentation is up to date with these changes.
## Description If we want to compose context implementations, we may add some concepts to avoid programming mistakes (and maybe specify the interface) closes ## Checklist - [ ] New...
## Description We want to make sure we do not lose the nice properties of mdspan when manipulating things by means of logical data facilities closes ## Checklist -...
### Is this a duplicate? - [x] I confirmed there appear to be no [duplicate issues](https://github.com/NVIDIA/cccl/issues) for this bug and that I agree to the [Code of Conduct](CODE_OF_CONDUCT.md) ### Type...
### PR content/description This PR introduces CUDA graphs by means of the CUDASTF library TODO * [x] fetch sources from cmake * [ ] add an option in cmake/cargo...
Compute residuals using a reduce access mode rather than sequential code on the host ### Which kernels are implemented? - [ ] synch_p2p (p2p) - [ ] stencil -...
This is intended to explore how we can use CUDASTF within cuGraph