Luca Lavarini
Two codegen errors on the CUDA architecture:

- Copy from SharedMemory to Global Memory does not find a suitable constructor (see fun bug1)
- Basic operators such as '-' or math library...

Compiling the CUDA SDFG below yields a codegen error (`dace::GlobalToGlobal` cannot be found): [global_to_global_bug.sdfg.zip](https://github.com/spcl/dace/files/5036189/global_to_global_bug.sdfg.zip). Instead of generating `GlobalToGlobal` in `dace/codegen/targets/cuda.py`, we should raise a `NotImplementedError` that mentions that GPU global...
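
A minimal sketch of the proposed behaviour, assuming the code generator picks a copy routine from the source and destination storage types. Everything here is hypothetical: `Storage`, `_SUPPORTED_COPIES`, `copy_function_name`, and the routine names are illustrative stand-ins, not the actual contents of `dace/codegen/targets/cuda.py`. The wording of the error message is also a placeholder, since the intended text is cut off above.

```python
# Hypothetical sketch: none of these names are taken from DaCe itself.
from enum import Enum, auto


class Storage(Enum):
    """Illustrative storage locations for a GPU copy."""
    GPU_GLOBAL = auto()
    GPU_SHARED = auto()


# Assumed table of (source, destination) pairs that have a copy routine.
# The routine names are placeholders for this example.
_SUPPORTED_COPIES = {
    (Storage.GPU_GLOBAL, Storage.GPU_SHARED): "GlobalToShared",
    (Storage.GPU_SHARED, Storage.GPU_GLOBAL): "SharedToGlobal",
}


def copy_function_name(src: Storage, dst: Storage) -> str:
    """Return the copy routine name, or fail early for unsupported pairs."""
    try:
        return _SUPPORTED_COPIES[(src, dst)]
    except KeyError:
        # Fail at code generation time instead of emitting a call to a
        # routine that does not exist and breaking the later C++ build.
        raise NotImplementedError(
            f"Copy from {src.name} to {dst.name} is not supported "
            "in generated device code"
        ) from None
```

With this shape, an unsupported pair such as `copy_function_name(Storage.GPU_GLOBAL, Storage.GPU_GLOBAL)` raises a Python exception that names both storage types, rather than surfacing later as a missing-symbol error from the CUDA compiler.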

Expanding a reduce node on a CUDA graph can lead to wrong behaviour, especially when dealing with reduce nodes nested inside another map.

1. If the maps' schedules of the...