CuArrays.jl
A Curious Cumulation of CUDA Cuisine
Porting code from CuGammaFuns.jl; see https://github.com/xukai92/CuGammaFuns.jl/issues/1. Currently I:
- add a new folder called `special` and put `gamma.jl` inside
- define the new diff. rules in `forwarddiff.jl` (a hedged sketch follows below)
- link implemented...
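As a point of reference, here is a minimal sketch of what such a forward-mode rule could look like, using the identity d/dx gamma(x) = gamma(x) * digamma(x); the function name `cugamma` is made up and this is not the PR's actual `forwarddiff.jl` code:

```julia
using ForwardDiff, SpecialFunctions

# Hypothetical sketch (not the PR's code): a forward-mode rule for a gamma-like
# function, propagating d/dx gamma(x) = gamma(x) * digamma(x) through the dual.
function cugamma(d::ForwardDiff.Dual{T}) where {T}
    x  = ForwardDiff.value(d)
    y  = SpecialFunctions.gamma(x)
    dy = y * SpecialFunctions.digamma(x)   # derivative of gamma at x
    return ForwardDiff.Dual{T}(y, dy * ForwardDiff.partials(d))
end

ForwardDiff.derivative(cugamma, 3.0)   # ≈ gamma(3) * digamma(3)
```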
Fixes https://github.com/FluxML/Flux.jl/issues/1114. The context here is that on the first call to the layer (also simulated by calling `Flux.reset!` on the structure), the gradients for the hidden layer were...
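A minimal, hypothetical reproduction of that situation (illustrative only, assuming the Flux API of that era; the real details are in the linked issue):

```julia
using Flux

# Hypothetical repro sketch: gradients through a recurrent layer immediately after
# `Flux.reset!`, i.e. on the "first call" to the layer.
m = Flux.RNN(3, 2)
x = rand(Float32, 3)
Flux.reset!(m)
gs = gradient(() -> sum(m(x)), Flux.params(m))
[gs[p] for p in Flux.params(m)]   # inspect gradients, including the hidden state's
```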
Massively speeds up host operations, without losing `assertscalar`-like functionality. Doesn't work properly right now, due to the coherency requirements of unified memory on pre-sm_60 hardware. Basically, we'd need to synchronize...
## Current Issues:
- [ ] Line 47 in upsample.jl is not a safe operation without atomic addition. It results in incorrect and inconsistent answers in the backward pass (see the atomic-add sketch below). @maleadt...
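To illustrate the atomic-addition point, here is a hedged 1-D sketch (not the actual upsample.jl code; `upsample_grad_kernel!` and the indexing scheme are made up for brevity):

```julia
using CUDA

# Hypothetical 1-D sketch: several `dy` entries map onto the same `dx` entry in the
# backward pass, so the accumulation must be atomic to avoid racing writes.
function upsample_grad_kernel!(dx, dy, scale)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(dy)
        j = (i - 1) ÷ scale + 1
        CUDA.@atomic dx[j] += dy[i]   # safe even when threads collide on the same j
    end
    return nothing
end

dy = CUDA.rand(Float32, 8)
dx = CUDA.zeros(Float32, 4)
@cuda threads=256 upsample_grad_kernel!(dx, dy, 2)
```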
Some of the commonly used kernels (conv and maxpool) are not deterministic by default. This hurts reproducibility a lot: when fixing the random seed (e.g. via `seed!(1)`), users cannot reproduce...
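For context, a hypothetical way to observe this (the layer, sizes, and `conv_grad` helper are assumptions, not code from the issue):

```julia
using CUDA, Flux, Random

# Hypothetical check: even with both RNGs seeded, repeated conv backward passes can
# disagree, because the default cuDNN algorithm accumulates in a non-deterministic order.
function conv_grad()
    Random.seed!(1); CUDA.seed!(1)
    x = CUDA.rand(Float32, 32, 32, 3, 4)
    c = Conv((3, 3), 3 => 8) |> gpu
    gs = gradient(() -> sum(c(x)), Flux.params(c))
    return Array(gs[c.weight])
end

conv_grad() == conv_grad()   # can be `false` despite identical seeds
```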
In fact, there are more algorithms than we include now: https://docs.nvidia.com/cuda/cutensor/api/types.html#cutensoralgo-t ("values >= 0 correspond to certain sub-algorithms of GETT"). I checked, and currently the maximum number of algorithms function...
As reported in https://github.com/JuliaGPU/CuArrays.jl/issues/629, one could still add an additional test.
Hi there, I really enjoyed the tutorial on starting out with the GPU and thought I'd give it a go for a very specific use case. Value function iteration is...
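To make the use case concrete, here is a rough, hypothetical GPU formulation of value function iteration; the names, array layout, and convergence check are all assumptions rather than code from the tutorial or the post:

```julia
using CUDA

# Hypothetical sketch: R is the reward matrix (states × actions), P the transition
# tensor with P[s′, s, a] = Pr(s′ | s, a), β the discount factor. Everything is
# phrased as dense array ops so it maps directly onto GPU arrays.
function vfi(R::CuMatrix{T}, P::CuArray{T,3}, β::T; tol=1e-6, maxiter=1_000) where {T}
    nS, nA = size(R)
    V  = CUDA.zeros(T, nS)
    Pm = reshape(permutedims(P, (2, 3, 1)), nS * nA, nS)   # (s, a) × s′ matrix
    for _ in 1:maxiter
        EV   = reshape(Pm * V, nS, nA)                      # expected continuation value
        Vnew = vec(maximum(R .+ β .* EV; dims=2))           # Bellman update
        maximum(abs.(Vnew .- V)) < tol && return Vnew
        V = Vnew
    end
    return V
end
```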
Thanks to `IRTools.jl`, we can do some nifty things with Julia IR, like using a `dynamo` to walk through the deep IR and offload sensible ops to the GPU. ```julia...
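# Hypothetical sketch of the idea above (the original snippet is truncated): an
# IRTools dynamo that recursively intercepts every call, leaving a hook where
# selected operations could be rewritten into their GPU equivalents.
using IRTools: IR, @dynamo, recurse!

@dynamo function offload(a...)
    ir = IR(a...)
    ir === nothing && return    # no IR available (intrinsics etc.): fall back to the plain call
    recurse!(ir)                # make nested calls also go through `offload`
    return ir
end

offload(sum, [1.0, 2.0, 3.0])   # behaves like sum(...), but every call is intercepted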
DiffEqBase.jl has been carrying an `ldiv!` overload to make it work for a while (https://github.com/JuliaDiffEq/DiffEqBase.jl/blob/master/src/init.jl#L148-L152), and I think it might be a good time to upstream it.
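For the record, a hedged sketch of the kind of overload being discussed (the actual code lives at the link above; `gpu_ldiv!` is a made-up stand-in name, and routing through the out-of-place `\` is only one way to do it):

```julia
using LinearAlgebra, CUDA

# Hypothetical stand-in for the overload discussed above: an in-place 3-argument
# solve for GPU arrays, expressed via the out-of-place `\` that the GPU stack
# already provides for factorizations.
function gpu_ldiv!(x::CuArray, F, b::CuArray)
    copyto!(x, F \ b)
    return x
end

A = CUDA.rand(Float32, 4, 4)
b = CUDA.rand(Float32, 4)
x = similar(b)
gpu_ldiv!(x, lu(A), b)
```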