DaggerGPU.jl
Add optimized move for ROCm/AMDGPU
We already have optimized `move` methods for CUDA, and we should add the equivalent for ROCm/AMDGPU. The two APIs are likely nearly identical, so this should be easy.