llama.cpp

cuda : add half2 __shfl_xor() for ROCm 5.5

Open Engininja2 opened this issue 1 year ago • 0 comments

ROCm only gained a half2 overload of __shfl_xor() in version 5.6. This PR provides that overload for earlier HIP versions, such as the one shipped with ROCm 5.5. Fixes #7242
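For readers unfamiliar with this kind of shim, the usual technique is to treat the 32-bit half2 as a plain integer, exchange it with the integer shuffle that already exists, and reinterpret the result. The following is a minimal host-side C++ sketch of that idea, not the code from this PR: `Half2Bits`, `kWarpSize`, `shfl_xor_u32` and `shfl_xor_half2` are hypothetical names, and the "warp" is simulated as an array so the example runs without a GPU.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for half2: two raw 16-bit half-precision bit patterns.
struct Half2Bits {
    std::uint16_t x;
    std::uint16_t y;
};

// Assumed lane count for this simulation only; real wavefronts may be wider.
constexpr int kWarpSize = 32;

// Simulated integer XOR shuffle: lane i receives the value held by lane (i ^ lane_mask).
// The mask is clamped to the simulated width so the source lane always exists.
std::array<std::uint32_t, kWarpSize> shfl_xor_u32(
        const std::array<std::uint32_t, kWarpSize>& lanes, int lane_mask) {
    std::array<std::uint32_t, kWarpSize> out{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        out[lane] = lanes[(lane ^ lane_mask) & (kWarpSize - 1)];
    }
    return out;
}

// half2-style shuffle built on the integer one: pack each lane's two halves
// into 32 bits, shuffle the integers, then unpack into the half2 layout.
std::array<Half2Bits, kWarpSize> shfl_xor_half2(
        const std::array<Half2Bits, kWarpSize>& lanes, int lane_mask) {
    static_assert(sizeof(Half2Bits) == sizeof(std::uint32_t),
                  "Half2Bits must occupy exactly 32 bits");

    std::array<std::uint32_t, kWarpSize> packed{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        std::memcpy(&packed[lane], &lanes[lane], sizeof(std::uint32_t));
    }

    const std::array<std::uint32_t, kWarpSize> shuffled = shfl_xor_u32(packed, lane_mask);

    std::array<Half2Bits, kWarpSize> out{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        std::memcpy(&out[lane], &shuffled[lane], sizeof(std::uint32_t));
    }
    return out;
}
```

In device code the same reinterpretation would typically sit behind a HIP version check, so that it is compiled only when the runtime does not already provide the half2 overload. The exact guard and helper used by this PR are not shown in the description above.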

Engininja2 · May 13 '24 18:05