Andreas Kieslinger

1 issue by Andreas Kieslinger

Hi all, this PR takes the ideas applied to the Vulkan backend (#9118 and #10499) and implements them for CUDA. This improves tokens-per-second performance. ### Performance...

Nvidia GPU
ggml