Andreas (Andi) Kunar

52 comments by Andreas (Andi) Kunar

> Sure, [here](https://huggingface.co/vicuna/ggml-vicuna-13b-1.1) is the link. This seems to be a problem caused by llama.cpp recently changing file formats. See https://github.com/imartinez/privateGPT/issues/567#issuecomment-1569991256, which might help you use the model.

@Kangmo, @muxx, @MoonKraken I found a solution for using llama.cpp in Linux VMs on Apple Silicon (and probably also in Docker on Apple Silicon) without changing any code. Maybe this also...

Same here: llama7B, llama13B, alpaca, ... all working locally with llama.cpp on the command line. All hanging on load. The parameters for invoking llama.cpp on the command line seem right, and the command-line status shows apparent...

Updated and fully edited for clarity. I'm on macOS/Apple Silicon, running the latest llama.cpp with several models from the terminal. It all works fine in the terminal, even when testing...

Update: I got it to work (most of the time) on my Mac by changing alpaca_turbo.py quite a bit. But I don't think it is mergeable into a pull request,...

If you redo connectors.AI.ollama, please also consider providing some basic OpenAIPromptExecutionSettings compatibility for OllamaAIPromptExecutionSettings. Currently, for example, "temperature" with OpenAIPromptExecutionSettings becomes "options"."temperature" with OllamaPromptExecutionSettings. This is e.g. breaking the...
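
The mismatch described in this comment (a flat "temperature" on the OpenAI side versus a nested "options"."temperature" on the Ollama side) can be illustrated with a small compatibility shim. The following is only a sketch under stated assumptions: `to_ollama_settings` and `OLLAMA_OPTION_KEYS` are hypothetical names that do not come from Semantic Kernel, the settings are modeled as plain dictionaries rather than the real settings classes, and the key map covers only a few common sampling parameters.

```python
# Hypothetical compatibility shim: these names are illustrative and are NOT
# part of Semantic Kernel or of the Ollama client library.

# Flat OpenAI-style keys mapped to the key used inside Ollama's nested
# "options" object. Only a few common parameters are covered here.
OLLAMA_OPTION_KEYS = {
    "temperature": "temperature",
    "top_p": "top_p",
    "seed": "seed",
    "stop": "stop",
    "max_tokens": "num_predict",
}


def to_ollama_settings(openai_settings: dict) -> dict:
    """Move flat OpenAI-style sampling keys into a nested "options" dict."""
    # Start from any "options" the caller already supplied, so they survive.
    options = dict(openai_settings.get("options", {}))
    result = {}
    for key, value in openai_settings.items():
        if key == "options":
            continue  # already copied above
        if key in OLLAMA_OPTION_KEYS:
            options[OLLAMA_OPTION_KEYS[key]] = value
        else:
            result[key] = value  # e.g. "model" stays at the top level
    if options:
        result["options"] = options
    return result


# "llama3" is just a placeholder model name for the example.
example = to_ollama_settings(
    {"model": "llama3", "temperature": 0.2, "max_tokens": 128}
)
print(example)
```

With a mapping like this, callers could keep passing OpenAI-style keys while the connector emits the nested shape that Ollama expects. Whether such a translation belongs in the settings class or in the connector is a design decision left open here.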

Please note that on Snapdragon X, moving from [Q4_0 to Q4_0_4_8](https://github.com/ggerganov/llama.cpp/pull/5780) gives a 2-2.5x increase in CPU inference speed! With this, my Snapdragon X Plus (CPU-only) has nearly the same performance as...

Please be aware that there should also be a change in ggml.go. When trying to import a Q4_0_4_8-quantized GGUF file supported by llama.cpp (see llama.cpp PR #5780), ggml.go reports an...

@dhiltgen I tested your current changes on Windows for ARM on a Surface 11 Pro base model, and they require MinGW and gcc, which are not available for ARM. For ARM...

> Windows arm64 tools do seem to be a bit challenging, but I found it via [msys2](https://www.msys2.org/wiki/arm64/) via the `mingw-w64-clang-aarch64-gcc-compat` package > > ``` > get-command gcc > > CommandType...