MoonRide303

9 issues by MoonRide303

Training LoRAs doesn't require a lot of computing power, and can be done on desktop PCs with a single modern GPU. Python, PyTorch and CUDA officially support Windows, and popular tools...

It would be nice to have standardized prompting metadata defined within GGUF files. Currently, when importing a GGUF model into tools like Ollama, it's necessary to explicitly provide prompting metadata -...

I've just checked the installer (v1.7.8) on VirusTotal, and got 4 detections as a trojan. Could you guys please make sure it stays clean, and not use any shady practices? RWKV...

**Describe the bug** Running a model from a GGUF file using [llama.cpp](https://github.com/ggerganov/llama.cpp) is very straightforward, just like that: `server -v -ngl 99 -m Phi-3-mini-4k-instruct-Q6_K.gguf` and if the model is supported, it just...

new feature

### Name and Version .\llama-cli.exe --version version: 4491 (c67cc983) built with MSVC 19.39.33523.0 for x64 ### Operating systems Windows ### Which llama.cpp modules do you know to be affected? Python/Bash...

bug-unconfirmed
stale

### Name and Version llama-cli --version version: 4713 (a4f011e8) built with MSVC 19.42.34436.0 for x64 ### Operating systems Windows ### Which llama.cpp modules do you know to be affected? llama-server...

bug

JSON files must be UTF-8 encoded, see [8.1 in RFC 8259](https://www.rfc-editor.org/rfc/rfc8259#section-8.1).

script
python
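A minimal sketch of what the UTF-8 requirement from RFC 8259 means for a Python script. It assumes the problem is the usual one on Windows, where `open()` without an explicit `encoding` falls back to the locale code page. The helper names `write_json_utf8` and `read_json_utf8` are hypothetical and not taken from the issue.

```python
import json


def write_json_utf8(path, data):
    """Write data as JSON, always encoding the file as UTF-8."""
    # Passing encoding explicitly avoids the platform default
    # (for example cp1252 on many Windows setups).
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII characters unescaped,
        # so the file encoding actually matters here.
        json.dump(data, f, ensure_ascii=False)


def read_json_utf8(path):
    """Read a JSON file, always decoding it as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

With both helpers, a file containing non-ASCII text round-trips the same way regardless of the system locale.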

### Prerequisites - [x] I am running the latest code. Mention the version if possible as well. - [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md). - [x] I searched using keywords...

enhancement

GGUF (llama.cpp) is a very popular format used for handling quantized models - it would be nice to see an evaluation of it, and also of quants like Q6_K (0.16% PPL difference vs...