llama2.zig
Add support for 8-bit Quantization
See:
- https://github.com/karpathy/llama2.c/issues/277
- https://github.com/karpathy/llama2.c/pull/298
- https://github.com/karpathy/llama2.c/pull/312
- https://github.com/karpathy/llama2.c/pull/364
- https://github.com/ggerganov/llama.cpp/issues/397
- https://arxiv.org/pdf/2101.01321v3.pdf