Federico Busato
cuSPARSE SpMV is limited to real and complex data types and doesn't support custom operators. We plan to add JIT LTO to SpMV (similar to SpMMOp) in the future to...
Hi @tlu7, > Shall I assume that all these algorithms are available of cuda 11.2 and onwards. Is there any document that I can find this information? I need the...
There is a small trick that you can use to check old toolkit documentation 😀 [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) `CUSPARSE_SPMM_CSR_ALG3` and the SpMV algorithms were introduced in CUDA 11.2u1 [https://docs.nvidia.com/cuda/archive/11.2.1/cusparse/index.html#cusparse-generic-function-spmm](https://docs.nvidia.com/cuda/archive/11.2.1/cusparse/index.html#cusparse-generic-function-spmm)
Adding this example has a low priority at the moment. However, if you think this could be useful for other users, your contribution would be greatly appreciated.
Please consider using `uint32_t` for the storage type if the C++ specification allows it https://github.com/NVIDIA/cccl/blob/main/libcudacxx/include/cuda/std/detail/libcxx/include/bitset#L151. 64-bit operations are less efficient on GPU architectures.
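To make the suggestion concrete, here is a minimal sketch of a fixed-size bit container whose storage word is a template parameter defaulting to `uint32_t`. It is not the libcu++ `bitset` implementation; `small_bitset`, `demo_bits`, and every member name are hypothetical and exist only for illustration. It assumes C++14 or newer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical container for illustration only; NOT the libcu++ bitset.
// The storage word defaults to 32 bits, as suggested in the comment above.
template <std::size_t N, typename Word = std::uint32_t>
struct small_bitset {
    static_assert(N > 0, "at least one bit is required");

    static constexpr std::size_t bits_per_word = sizeof(Word) * 8;
    // Round up so that a partially used last word is still allocated.
    static constexpr std::size_t num_words =
        (N + bits_per_word - 1) / bits_per_word;

    Word words[num_words] = {};

    // Set the bit at position pos (no bounds checking in this sketch).
    constexpr void set(std::size_t pos) {
        words[pos / bits_per_word] |= Word{1} << (pos % bits_per_word);
    }

    // Return true if the bit at position pos is set.
    constexpr bool test(std::size_t pos) const {
        return ((words[pos / bits_per_word] >> (pos % bits_per_word)) & Word{1}) != 0;
    }
};

// Small helper that builds a 40-bit set with bits 0 and 33 enabled.
constexpr small_bitset<40> demo_bits() {
    small_bitset<40> b{};
    b.set(0);
    b.set(33);
    return b;
}
```

With the default word type, `small_bitset<40>` is stored as two 32-bit words, and bit 33 lands in the second word. Whether a standard-conforming `bitset` may use this layout is exactly the open question in the comment above.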
Our RFE: - `deallocate/deallocate_async` functions should accept `const void*` to skip `const_cast()` on the user side - Allow `cuda::mr::*` functions in device code - Clarify (or fix) the expected behavior...
> Can you elaborate on what you mean? allocate() and deallocate() are expected to always be synchronous. Yes, but what is their purpose if the code uses an `async_resource` with...
OK, I hadn't interpreted `async_resource` as a superset of the `resource` concept. In that case, could we please clarify this point in the docs?
I perfectly understand this constraint. It would be nice to add the `cuda::complex` type if it is not too much effort.
@alliepiper FYI I opened another issue for host-side sanitizers https://github.com/NVIDIA/cccl/issues/2241