la1ty
Agree. I've manually built a CUDA version, but an official prebuilt release would be more convenient for most users.
The latest version is v0.3.7. You can follow the steps in the [CI workflow](https://github.com/abetlen/llama-cpp-python/blob/main/.github/workflows/build-wheels-cuda.yaml). For Windows users, here are my two cents: 0. (Optional) Uninstall all MinGW tools (clang, gcc,...
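As a rough sketch, the source build on Windows boils down to the commands below (the `GGML_CUDA` and `FORCE_CMAKE` variables follow the project's README; having VS Build Tools and a CUDA Toolkit installed first is assumed):

```shell
# PowerShell sketch: build llama-cpp-python v0.3.7 with the CUDA backend from source.
$env:CMAKE_ARGS = "-DGGML_CUDA=on"   # enable llama.cpp's CUDA backend
$env:FORCE_CMAKE = "1"               # force a source build instead of a prebuilt wheel
pip install llama-cpp-python==0.3.7 --no-cache-dir --verbose
```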
@dw5189 I can think of two possible causes: 1. Make sure you are using the VS version of `cmake.exe` to compile this project. I run `cmake --version` in Powershell and it...
I've successfully built a CUDA wheel. Your environment looks correct, and I did not encounter this error when I built this package. Here are some suggestions that may be useful: 1. Try...
Probably a duplicate of #1917 . You can also try [my advice on building this wheel](https://github.com/abetlen/llama-cpp-python/issues/1925).
Did you forget to copy the four files from the CUDA MSBuildExtensions directory to the VS BuildCustomizations directory after installation? The `Everything` search tool can help you locate them.
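For reference, a hedged sketch of that copy step; the CUDA version (`v12.4`) and VS edition in the paths are assumptions, so adjust them to your installation:

```shell
# PowerShell sketch: copy the CUDA MSBuild integration files into Visual Studio.
# Run from an elevated prompt; paths vary with your CUDA and VS versions.
$cuda = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\extras\visual_studio_integration\MSBuildExtensions"
$vs   = "C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations"
Copy-Item "$cuda\*" $vs
```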
Yes, minicpm-o-2.6 works with the minicpm-v-2.6 chat handler. But Qwen2-VL does not seem to work with any existing chat handler. I tried to use the example chat template from [llama.cpp](https://github.com/ggerganov/llama.cpp/pull/11642) but...
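For context, loading minicpm-o-2.6 through the MiniCPM-V-2.6 handler looks roughly like this (the `.gguf` file names and image path are placeholders, not actual release artifacts; `MiniCPMv26ChatHandler` is the handler shipped in llama-cpp-python's `llama_chat_format` module):

```python
# Sketch: run minicpm-o-2.6 with the existing MiniCPM-V-2.6 chat handler.
# Both .gguf file names below are placeholders for your local downloads.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import MiniCPMv26ChatHandler

chat_handler = MiniCPMv26ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(
    model_path="minicpm-o-2.6-q4_k_m.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,  # leave room for image embeddings plus the text prompt
)
response = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "file:///path/to/image.png"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }]
)
print(response["choices"][0]["message"]["content"])
```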
@samkoesnadi I downloaded them from [HuggingFace](https://huggingface.co/bartowski/Qwen2-VL-7B-Instruct-GGUF/). Hope you have some good news.
@kseyhan Yes, that's exactly what I experienced. I don't know if I made errors while compiling, but I found that text responses generated by Qwen2-VL-7B with llama-cpp-python v0.3.7 are...
Comments that may be off topic: ~~I tested Qwen2.5-VL-7B in several use cases and it did not seem to perform better than MiniCPM-O-2.6. If you want to build a...