Luo Bo

2 issues by Luo Bo

I wonder if there is any chance that Triton will support the [`red`](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-red) instruction for performing reductions on global memory? The reason this is superior to the existing atomic operations (i.e.,...