Aaron Beier
I'm still getting this problem with a B580; compared to Vulkan, the prompt processing speed more than triples, but the generation speed drops from 7.3 t/s to 5.0 t/s with `-nkvo`....
> If there is integrated GPU (iGPU) in CPU, llama.cpp SYCL backend will use both iGPU & dGPU to load the LLM and run on them. It will support more...
FWIW it's pretty easy (just kind of a hassle) to use `BencodeParser` instead of `TorrentParser` to parse torrent files and grab the values you need yourself. You can easily detect...
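To illustrate the "parse it yourself and grab the values you need" approach: bencode is a tiny format (integers, byte strings, lists, dicts), so a hand-rolled decoder is short. `BencodeParser`/`TorrentParser` are names from the .NET BencodeNET library; this Python sketch is not that API, just a minimal illustration of the underlying format, with a hypothetical hand-built payload standing in for a real .torrent file.

```python
# Minimal bencode decoder sketch: walk a .torrent-style payload by hand
# and pull out specific values, rather than going through a high-level
# torrent parser. Illustrative only; real .torrent files are read as bytes.

def decode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at index i; return (value, next_i)."""
    c = data[i:i + 1]
    if c == b"i":                       # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                       # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            v, i = decode(data, i)
            items.append(v)
        return items, i + 1
    if c == b"d":                       # dict: d<key><value>...e
        i += 1
        d = {}
        while data[i:i + 1] != b"e":
            k, i = decode(data, i)      # keys are byte strings
            v, i = decode(data, i)
            d[k] = v
        return d, i + 1
    colon = data.index(b":", i)         # byte string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length

# Tiny hand-built payload resembling a torrent's top-level dict (hypothetical values).
payload = (b"d8:announce22:http://tracker.example"
           b"4:infod4:name8:demo.bin12:piece lengthi262144eee")
meta, _ = decode(payload)
print(meta[b"announce"])                # b'http://tracker.example'
print(meta[b"info"][b"name"])           # b'demo.bin'
print(meta[b"info"][b"piece length"])   # 262144
```

From the decoded dict you can then inspect whatever keys you care about (e.g. detect single-file vs multi-file layouts by whether `info` has a `files` key) without a torrent-specific wrapper.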
Forgot to mention that this comes from https://github.com/ggml-org/llama.cpp/issues/11044#issuecomment-3392718716