Andreas Kieslinger
Results: 1 issue by Andreas Kieslinger
Hi all, this PR takes the ideas applied to the Vulkan backend (#9118 and #10499) and implements them for CUDA. This results in improved tokens-per-second performance. ### Performance...
Nvidia GPU
ggml