AMDGPU.jl
RCCL wrappers
Similar to https://github.com/JuliaGPU/NCCL.jl/, it would be great to have wrappers for RCCL for training DL models on multiple AMD GPUs.
(I know MPI.jl has ROCm support, but we don't ship JLLs that are ROCm-aware, so it might be good to ship an RCCL_jll that automatically allows ROCm-aware communication without copying to the CPU.)
+1. I was looking at Lux.jl and it's a shame that they have native support for NCCL but not for RCCL.