stduhpf
By removing references to CUDA and changing the torch backend from "nccl" to "gloo", just like in the fork by [markasoftware](https://github.com/markasoftware/llama-cpu), I got the 7B model to work fine on...
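For context, a minimal sketch of that kind of change, assuming a setup function resembling the LLaMA reference code (the name `setup_model_parallel` and the single-process defaults are illustrative here, not the fork's exact code):

```python
import os
import torch
import torch.distributed as dist

def setup_model_parallel() -> tuple[int, int]:
    # Illustrative single-process defaults; the reference code reads
    # LOCAL_RANK / WORLD_SIZE from the environment instead.
    local_rank, world_size = 0, 1
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # "gloo" runs on CPU; "nccl" requires CUDA devices.
    dist.init_process_group(backend="gloo", rank=local_rank, world_size=world_size)
    # Removed: torch.cuda.set_device(local_rank) and other CUDA-only calls.
    torch.manual_seed(1)
    return local_rank, world_size

if __name__ == "__main__":
    rank, ws = setup_model_parallel()
    print(f"rank={rank}, world_size={ws}, backend=gloo (CPU)")
```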
After loading a Gemma model to VRAM using the Vulkan backend, the VRAM usage suddenly doubles, even at very low context sizes. Here is the VRAM usage when loading the...
When using `llava-cli` or the multimodal mode of the `server` with the Vulkan backend, the language model generates gibberish output, regardless of the number of layers offloaded to the GPU. ~I...
Hi, I think this tool you made is awesome, and I really enjoy playing with it. My only "complaint" is that when using a grayscale image as an aperture, the...
Fixes https://github.com/PotatoSpudowski/fastLLaMa/issues/84. Should make the user experience much better when accessing the webui from a smartphone or a vertical monitor.
The webui is almost unusable on mobile because the sidebar with the list of saves takes up most of the space, leaving only a tiny portion of the screen for the...
Llama.cpp somewhat recently added support for OpenCL acceleration, enabling hardware acceleration on AMD GPUs. Would it be possible to do the same thing here?
For example, it still uses the old syntax for `convert-pth-to-ggml.py` and for `export-from-huggingface.py`. Also, maybe it would be better to clarify that even when installing through pip, we still need to...
```sh
$: python examples/python/example-alpaca.py
Traceback (most recent call last):
  File "examples/python/example-alpaca.py", line 1, in <module>
    from fastllama import Model
ImportError: cannot import name 'Model' from 'fastllama' (unknown location)
$: pip install...
```
### What happened?

When trying to build the latest version of the Vulkan backend, the shader compilation fails. I suspect commit 17eb6aa8a992cda37ee65cf848d9289bd6cad860 to have introduced the issue, but more testing...