CUDA.jl
CUDA programming in Julia.
Slow 2D sum
Following up on this [observation in issue #1323](https://github.com/JuliaGPU/CUDA.jl/issues/1323). Given the code: ``` using BenchmarkTools using CUDA function mysum(X,Y,n) I = (blockIdx().x - 1) * blockDim().x + threadIdx().x I > n && return...
Newer NVIDIA RT series products include a new RT Core in the GPU. By using this RT Core, a fast BVH and intersection check could become...
**Is your feature request related to a problem? Please describe.** I would like to be able to compute the eigenvectors and eigenvalues of a matrix which can either hold floats or...
The concept of memory pinning is discussed in the “Tasks and threads” section of the documentation (in `multitasking.md`), but `Mem.pin` is only shown as part of a larger example. It...
**Is your feature request related to a problem? Please describe.** I would like to be able to use [LinearAlgebra's 3-argument `dot`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.dot-Tuple{Any,%20Any,%20Any}) function to compute the...
The current example calls `Array` in the `@async` block, which is implicitly synchronizing. The docs should probably explain how that works, and that you want to use an explicit `synchronize()`...
When I set all GPU variables to nothing and call CUDA.reclaim(), my GPU memory remains full (does not go back to initial usage). Currently the models being loaded onto the...
The `plan_*fft` functions in `AbstractFFTs` take keyword arguments, but the methods of these functions provided by `CUDA.CUFFT` do not. Code that passes keyword arguments to these functions, e.g. to influence...