Nikola Janjušević

Results 6 issues of Nikola Janjušević

Float16 CUDA `conv` seems to be broken for 5D tensors, but not 3D or 4D tensors. FluxML/Flux.jl#2184 (using Julia 1.8.3 on an A100 GPU.) ```julia julia> conv(rand(Float16, 16, 16, 1,...

bug
CUDA

I'm wondering if you think it's possible to write similar code for a CUDA version of AD for a sparse matrix times a dense vector. If so, would it be somewhat straightforward from...

It seems the new release (v0.9.18) breaks padding. Before (v0.9.17): ```julia julia> using NNlib julia> x = rand(1:9, 3,3,1,1) 3×3×1×1 Array{Int64, 4}: [:, :, 1, 1] = 7 3 2 1...

The NCCL backend in distributed utils does not support complex values ([see issue](https://github.com/NVIDIA/nccl/issues/539)). Can we add a convenient wrapper in the NCCLEXT to support broadcast, reduce, etc., likely via using...

enhancement

Addresses #56. Tests added for ComplexF64. Feedback appreciated.

NCCL does not support complex numbers directly and does not plan to ([see issue](https://github.com/NVIDIA/nccl/issues/539)). Are we willing to add a wrapper to NCCL.jl to make using complex numbers more convenient?...