
Evaluation of GGUF?

Open MoonRide303 opened this issue 2 years ago • 0 comments

GGUF (llama.cpp) is a very popular format for quantized models. It would be nice to see an evaluation of it, including quants like Q6_K (0.16% PPL difference vs fp16).
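For anyone unsure what a figure like "0.16% PPL difference" refers to, here is a rough sketch of the arithmetic. It assumes the figure is the relative increase in perplexity of the quantized model over the fp16 baseline, which is my reading and not something confirmed by the llama.cpp docs. The function names and the numbers in the usage lines are made up for illustration and are not measured results.

```python
import math

def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative log-likelihood per token.

    token_log_probs: natural-log probabilities the model assigned to each
    reference token (hypothetical input, not a llama.cpp data structure).
    """
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def ppl_diff_percent(ppl_quant, ppl_fp16):
    """Relative perplexity change of a quantized model vs fp16, in percent."""
    return 100.0 * (ppl_quant - ppl_fp16) / ppl_fp16

# Illustrative placeholder values only (NOT real measurements).
example_fp16 = 5.0
example_quant = 5.008
print(f"{ppl_diff_percent(example_quant, example_fp16):.2f}%")
```

A lower percentage means the quantized model stays closer to the fp16 baseline on the evaluation text.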

Evaluations for Llama 2 70B are available in the llama.cpp project: https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md

MoonRide303 • Apr 23 '24 08:04