
Official inference framework for 1-bit LLMs

227 BitNet issues

While profiling BitNet inference (single-threaded run_inference), I observed that the function `ggml_vec_dot_i2_i8_s` (which performs a multiply-accumulate on 1.58-bit (ternary) and 8-bit data) dominates the runtime (I don't recall precisely, but...
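For readers unfamiliar with that kernel, `ggml_vec_dot_i2_i8_s` is essentially a dot product between 2-bit-packed ternary weights and int8 activations. Below is a minimal NumPy sketch of the same computation; the packing scheme (codes 0/1/2 for -1/0/+1, four weights per byte) is an illustrative assumption, not the exact ggml memory layout:

```python
import numpy as np

def pack_ternary(w):
    """Pack ternary weights {-1, 0, +1} as 2-bit codes {0, 1, 2}, four per byte."""
    codes = (w + 1).astype(np.uint8)  # map -1/0/+1 -> 0/1/2
    assert len(codes) % 4 == 0
    b = codes.reshape(-1, 4)
    return (b[:, 0] | (b[:, 1] << 2) | (b[:, 2] << 4) | (b[:, 3] << 6)).astype(np.uint8)

def vec_dot_i2_i8(packed, x):
    """Dot product of 2-bit packed ternary weights with an int8 activation vector."""
    acc = 0
    for i, byte in enumerate(packed):
        for j in range(4):
            code = (int(byte) >> (2 * j)) & 0x3       # extract one 2-bit code
            acc += (code - 1) * int(x[4 * i + j])     # decode to -1/0/+1 and accumulate
    return acc

w = np.array([1, -1, 0, 1, -1, 0, 1, 1], dtype=np.int8)
x = np.array([3, -2, 5, 1, 4, -7, 2, 6], dtype=np.int8)
assert vec_dot_i2_i8(pack_ternary(w), x) == int(np.dot(w.astype(np.int32), x))
```

The inner loop makes clear why this routine can dominate single-threaded runtime: every weight requires a shift, a mask, and a multiply-accumulate, which is exactly what the optimized kernels vectorize.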

Does the [bitnet-b1.58-2B-4T](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T) model support tool calling? If so, could you suggest how we can use this feature with bitnet.cpp? - https://github.com/ggml-org/llama.cpp/blob/master/docs/function-calling.md

As we can see, there are three new official BitNet 2B models: model1: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T model2: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16 model3: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf What I know so far is that model1 is used for GPU inference, model2...

Does anyone have the code for https://bitnet-demo.azurewebsites.net/ ?

Following all the steps as provided, I uninstalled and reinstalled everything, but the error persists: D:\BitNet\3rdparty\llama.cpp\common\common.cpp(445,32): error: no type named 'system_clock' in namespace 'std::chrono' [compile.log](https://github.com/user-attachments/files/20035110/compile.log) [generate_build_files.log](https://github.com/user-attachments/files/20035111/generate_build_files.log) [install_gguf.log](https://github.com/user-attachments/files/20035112/install_gguf.log)

Sorry for my ignorance, but the official paper mentions that GPU kernels will be released... did that happen already, or am I just looking in the wrong places? Thanks

I want to continue pretraining the 1.58-bit 2B model to add more of my language, or fine-tune it for specific knowledge. Is there any base code I could start with to...

I am no expert, so please forgive the naive questions, but: 1) Is there any way to integrate KBLaM into these models? 2) Is it possible to fine-tune the models...

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
```

error:

```
[/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py](https://localhost:8080/#)...
```