MoonRide303
Training LoRAs doesn't require a lot of computing power, and can be done on desktop PCs with a single modern GPU. Python, PyTorch, and CUDA officially support Windows, and popular tools...
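A minimal single-GPU LoRA fine-tuning sketch, assuming the Hugging Face transformers/peft/datasets stack; the base model name, dataset file, and hyperparameters below are placeholders rather than a tested recipe:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "microsoft/Phi-3-mini-4k-instruct"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# bf16 keeps memory use modest on Ampere-or-newer GPUs.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Wrap the base model with low-rank adapters; only these small matrices get trained,
# which is what keeps the VRAM and compute requirements low.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# Toy dataset wiring: one plain-text file, tokenized into capped-length samples.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]
dataset = dataset.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4, bf16=True, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # writes only the adapter weights
```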
It would be nice to have standardized prompting metadata defined within GGUF files. Currently, when importing a GGUF model into tools like Ollama, it's necessary to explicitly provide prompting metadata -...
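For what it's worth, GGUF already carries some of this as plain key-value metadata (e.g. a `tokenizer.chat_template` field when the converter embeds one). A small sketch for listing those keys with the gguf-py package from the llama.cpp repo; the file name is just a placeholder:

```python
from gguf import GGUFReader  # gguf-py package from the llama.cpp repo

reader = GGUFReader("some-model-Q6_K.gguf")  # placeholder file name

# reader.fields maps metadata key names to their stored values; prompting-related
# information, when present, lives under keys such as "tokenizer.chat_template".
for name in reader.fields:
    if name.startswith("tokenizer."):
        print(name)
```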
I've just checked the installer (v1.7.8) on VirusTotal, and got 4 detections as a trojan. Could you guys please make sure it stays clean, and don't use any shady practices? RWKV...
**Describe the bug** Running a model from a GGUF file using [llama.cpp](https://github.com/ggerganov/llama.cpp) is very straightforward, just like this: `server -v -ngl 99 -m Phi-3-mini-4k-instruct-Q6_K.gguf` and if the model is supported, it just...
### Name and Version .\llama-cli.exe --version version: 4491 (c67cc983) built with MSVC 19.39.33523.0 for x64 ### Operating systems Windows ### Which llama.cpp modules do you know to be affected? Python/Bash...
### Name and Version llama-cli --version version: 4713 (a4f011e8) built with MSVC 19.42.34436.0 for x64 ### Operating systems Windows ### Which llama.cpp modules do you know to be affected? llama-server...
JSON files must be UTF-8 encoded; see [Section 8.1 of RFC 8259](https://www.rfc-editor.org/rfc/rfc8259#section-8.1).
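For example, writing and then reading a JSON file with an explicit UTF-8 encoding in Python (file name and contents are just placeholders):

```python
import json

data = {"model": "some-model", "note": "zażółć gęślą jaźń"}  # non-ASCII is fine in UTF-8

# Write as UTF-8 and keep non-ASCII characters readable instead of \u-escaping them.
with open("config.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=2)

# Read it back, again explicitly as UTF-8 ("utf-8-sig" would also tolerate a BOM).
with open("config.json", "r", encoding="utf-8") as f:
    print(json.load(f))
```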
### Prerequisites - [x] I am running the latest code. Mention the version if possible as well. - [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md). - [x] I searched using keywords...
GGUF (llama.cpp) is a very popular format used for handling quantized models - it would be nice to see an evaluation of that, and also of quants like Q6_K (0.16% PPL difference vs...
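A rough sketch of how such a quant comparison could be run with llama.cpp's llama-perplexity tool; the model and corpus file names below are placeholders, with WikiText-2 raw test being the corpus commonly used for these numbers:

```python
import subprocess

# Placeholder GGUF files to compare; llama-perplexity is built as part of llama.cpp.
models = ["model-F16.gguf", "model-Q6_K.gguf"]
corpus = "wiki.test.raw"  # WikiText-2 raw test split

for model in models:
    # Each run prints a final perplexity estimate, so quants like Q6_K can be
    # compared against the full-precision baseline on the same text.
    subprocess.run(["llama-perplexity", "-m", model, "-f", corpus, "-ngl", "99"],
                   check=True)
```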