llama2.zig
Inference Llama 2 in pure Zig
See:

- https://github.com/karpathy/llama2.c/issues/277
- https://github.com/karpathy/llama2.c/pull/298
- https://github.com/karpathy/llama2.c/pull/312
- https://github.com/karpathy/llama2.c/pull/364
- https://github.com/ggerganov/llama.cpp/issues/397
- https://arxiv.org/pdf/2101.01321v3.pdf
Model: https://huggingface.co/georgesung/llama2_7b_chat_uncensored
In `simd.zig`, the vector length is computed as:

```zig
comptime var vector_len = std.atomic.cache_line / @sizeOf(f32);
```

It looks to me like there is a unit mismatch here. `std.atomic.cache_line` is...
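For context: `std.atomic.cache_line` is the target's cache-line size in bytes, so dividing it by `@sizeOf(f32)` yields the number of `f32` elements that fit in one cache line, which is not the same thing as the hardware SIMD register width. A minimal sketch of an alternative, assuming a recent Zig standard library where `std.simd.suggestVectorLength` is available, might look like:

```zig
const std = @import("std");

// Sketch only, not the repository's actual code: derive the vector
// length from the target's SIMD register width rather than from the
// cache-line size. `suggestVectorLength` returns null on targets
// without SIMD support, so fall back to a scalar length of 1.
const vector_len = std.simd.suggestVectorLength(f32) orelse 1;

pub fn main() void {
    // For comparison: f32 elements per cache line (bytes / bytes-per-element).
    const per_cache_line = std.atomic.cache_line / @sizeOf(f32);
    std.debug.print(
        "suggested simd width: {d}, f32s per cache line: {d}\n",
        .{ vector_len, per_cache_line },
    );
}
```

On a typical x86-64 target with AVX2 this yields a width of 8 `f32`s per vector, while a 64-byte cache line holds 16 `f32`s, which is one way the two quantities can diverge.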