llama-cpp-python 0.3.8 with CUDA
When will we have a recent version of llama-cpp-python that works with CUDA, installable via pip? It's a real nightmare to get it working any other way. 0.3.4 works with CUDA, but it doesn't support newer models like quantized Qwen 3.
With kind regards
You can try compiling the new code I maintain here: https://github.com/JamePeng/llama-cpp-python, but I have only pre-compiled the Windows and Linux versions based on the recent code.
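If the pre-built wheels there don't match your platform, building the fork from source should follow the same pattern as building upstream from source; this is only a hedged sketch (the git-URL install is an assumption, and it requires the CUDA toolkit plus a working C/C++ compiler):

# Sketch: build the fork from source with CUDA enabled; pip fetches the vendored llama.cpp submodule itself
CMAKE_ARGS="-DGGML_CUDA=on" pip install git+https://github.com/JamePeng/llama-cpp-python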
When will we have a recent version of llama-cpp-python that works with CUDA, installable via pip?
I probably misunderstand your question, but I am using 0.3.9 with CUDA via pip, and I can use Qwen3-related models as well (with I-quants or flash attention, for example; is that what you are referring to?). Sorry if my comment is useless.
I built using...
CMAKE_ARGS="-DGGML_CUDA=on -DLLAVA_BUILD=off -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
Hi, I think it might come from the fact that the wheels have not been updated to recent versions of llama-cpp-python. You can see that at this link: https://abetlen.github.io/llama-cpp-python/whl/cu122/llama-cpp-python/. The last version available is 0.3.4. Could you add newer wheels @abetlen?
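For reference, that index is what the pre-built-wheel install route points at, so it can only serve a recent version once newer wheels are published there:

# Install from the CUDA 12.2 wheel index instead of building locally
pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122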
@m-from-space I have tried your solution and I got this error:
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [47 lines of output]
*** scikit-build-core 0.11.3 using CMake 3.22.1 (wheel)
*** Configuring CMake...
loading initial cache file /tmp/tmpovycplbn/build/CMakeInit.txt
-- The C compiler identification is Clang 14.0.0
-- The CXX compiler identification is Clang 14.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - failed
-- Check for working CXX compiler: /usr/bin/clang++
-- Check for working CXX compiler: /usr/bin/clang++ - broken
CMake Error at /usr/share/cmake-3.22/Modules/CMakeTestCXXCompiler.cmake:62 (message):
The C++ compiler
"/usr/bin/clang++"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: /tmp/tmpovycplbn/build/CMakeFiles/CMakeTmp
Run Build Command(s):ninja cmTC_9090e && [1/2] Building CXX object CMakeFiles/cmTC_9090e.dir/testCXXCompiler.cxx.o
[2/2] Linking CXX executable cmTC_9090e
FAILED: cmTC_9090e
: && /usr/bin/clang++ -pthread CMakeFiles/cmTC_9090e.dir/testCXXCompiler.cxx.o -o cmTC_9090e && :
/usr/bin/ld: cannot find -lstdc++: No such file or directory
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:3 (project)
-- Configuring incomplete, errors occurred!
See also "/tmp/tmpovycplbn/build/CMakeFiles/CMakeOutput.log".
See also "/tmp/tmpovycplbn/build/CMakeFiles/CMakeError.log".
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)
Man... I've tried for hours to get similar commands to work and never succeeded, so I used JamePeng's wheel to make up for it. I just tried your command and it works fine. I just don't get it: I don't understand these CMAKE-based installations, or how it can work when abetlen hasn't done anything about it.
I'm glad that it worked for you. As far as I remember, it didn't build on my machine when LLAVA_BUILD wasn't turned off, so maybe that was your problem as well.
@m-from-space I have tried your solution and I got this error:
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
...
The C++ compiler "/usr/bin/clang++" is not able to compile a simple test program.
...
I am no expert on this, but it looks like the C++ compiler is not set up correctly on your system: it fails to compile even a simple test program. I am not using clang++ on my system, but gcc / g++ as the compilers.
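In your log the linker specifically cannot find -lstdc++. A hedged fix on Debian/Ubuntu-style systems (the distro and package name are assumptions based on the paths in your log) is to install the GNU toolchain and point the build at it:

# build-essential provides g++ and the libstdc++ the linker could not find
sudo apt install build-essential
# force gcc/g++ instead of clang for the pip build
CC=gcc CXX=g++ CMAKE_ARGS="-DGGML_CUDA=on -DLLAVA_BUILD=off -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade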
He's probably running on Windows. On Windows you need to install the C++ compiler via vs_BuildTools.exe
https://aka.ms/vs/17/release/vs_BuildTools.exe
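For completeness, a hedged sketch of the Windows route after installing the Build Tools (this assumes the "Desktop development with C++" workload was selected and the commands are run from the "x64 Native Tools Command Prompt" so the MSVC compiler is on PATH):

rem Same build flags as above, in cmd.exe syntax
set CMAKE_ARGS=-DGGML_CUDA=on -DLLAVA_BUILD=off
set FORCE_CMAKE=1
pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade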