rabidcopy
~~Strange, when comparing #775 to this I noticed a regression in the time it took to generate 1024 tokens. #775~~

```
llama_print_timings: load time = 2776.69 ms
llama_print_timings: sample time ...
```
> Alright, #775 clearly contributed to the results I got. I pulled master again with #775 already merged and now I'm getting:
>
> ```
> llama_print_timings: load time = ...
> ```
Yeah, no noticeable difference on a Ryzen 2600. Still, it's interesting if it can go somewhere.
I would bring up CLBlast, as it's been implemented over at https://github.com/LostRuins/koboldcpp/ and isn't Nvidia-exclusive, but in my experience the speed-ups are minor, or it just ends up being slower than...
> > I would bring up CLBlast as it's been implemented over at https://github.com/LostRuins/koboldcpp/ and isn't Nvidia-exclusive, but from my experience, speed ups are minor or just ends up being...
Doing a quick-and-dirty comparison between llama.cpp with OpenBLAS and koboldcpp with CLBlast. OpenBLAS processing time for dan.txt:

```
llama_print_timings: prompt eval time = 32412.01 ms / 399 tokens ...
```
Comparison between latest master with OpenBLAS processing dan.txt versus this PR with CLBlast. OpenBLAS on Ryzen 2600: `llama_print_timings: prompt eval time = 35540.49 ms / 399 tokens ( 89.07 ms ...`
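For anyone comparing these runs: the per-token figure llama.cpp prints is just the total prompt eval time divided by the token count. A quick sanity check of the OpenBLAS numbers quoted above (the helper names here are just for illustration, not anything from llama.cpp itself):

```python
# Recompute llama.cpp's printed per-token latency from the totals in the
# log line above: 35540.49 ms over 399 tokens should give 89.07 ms/token.

def per_token_ms(total_ms: float, n_tokens: int) -> float:
    """Average prompt-eval latency per token, in milliseconds."""
    return total_ms / n_tokens

def tokens_per_second(total_ms: float, n_tokens: int) -> float:
    """Prompt-eval throughput in tokens per second."""
    return n_tokens / (total_ms / 1000.0)

openblas = per_token_ms(35540.49, 399)
print(f"OpenBLAS: {openblas:.2f} ms/token "
      f"({tokens_per_second(35540.49, 399):.2f} tokens/s)")
# → OpenBLAS: 89.07 ms/token (11.23 tokens/s)
```

Handy for turning any of the `prompt eval time` lines in this thread into a directly comparable tokens/s number.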
Wanted to add: it appears OpenCL performance on AMD is actually better with the opencl-mesa package than with the opencl-amd package on Arch. `llama_print_timings: prompt eval time = 15324.17 ms ...`
> @rabidcopy Interesting result. I thought the Mesa OpenCL driver wasn't really functional. Do you know which hardware is supported? Or did you use the new rusticl already?

No idea...
Has anyone compared speeds between Clover and rusticl OpenCL? Apparently rusticl is getting merged into Mesa [soon](https://www.phoronix.com/news/RadeonSI-Rusticl-Mesa-23.1). Kinda curious if it would be worth going through the trouble to...