Publish the latest llama.cpp?
Hello, I run an AMD card and there have been very significant ROCm support updates (flash attention, new quant types, massive speed improvements) since the llama.cpp version currently vendored in llama-cpp-python.
Could you do us a big one and publish a new llama-cpp-python with the latest llama.cpp? It would be much appreciated! Thank you!
+1, would love to see an update to the latest llama.cpp
Up until a couple of weeks ago, the bindings were still close enough that you could pull upstream llama.cpp into the vendor directory and build locally. It looks like there's now a breaking change in the libllama contract: llama_load_model_from_file got renamed to llama_model_load_from_file.
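If you want to check which side of the rename a locally built libllama is on before swapping it into the vendor directory, a quick ctypes probe does the job. This is just a minimal sketch; the shared-library path is an assumption and will differ by platform and build setup (.dylib on macOS, .dll on Windows).

```python
import ctypes

# Sketch: probe a locally built libllama for the model-loading symbol.
# The path below is an assumption; adjust it for your build tree.
lib = ctypes.CDLL("vendor/llama.cpp/build/bin/libllama.so")

for name in (
    "llama_model_load_from_file",  # new name in current upstream llama.cpp
    "llama_load_model_from_file",  # old name the released bindings still call
):
    print(name, "found" if hasattr(lib, name) else "missing")
```

If only the new symbol shows up, the released bindings will fail to resolve the old one, which matches the breakage described above.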