
Official inference framework for 1-bit LLMs

227 BitNet issues

While profiling BitNet inference (single-threaded run_inference), I observed that the function `ggml_vec_dot_i2_i8_s` (which performs a multiply-accumulate on 1.58-bit (ternary) and 8-bit data) dominates the runtime (I don't recall precisely, but...
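For readers unfamiliar with that kernel, `ggml_vec_dot_i2_i8_s` is essentially a dot product between 2-bit-packed ternary weights and int8 activations. Below is a minimal NumPy sketch of the same computation; the packing scheme (codes 0/1/2 for -1/0/+1, four weights per byte) is an illustrative assumption, not the exact ggml memory layout:

```python
import numpy as np

def pack_ternary(w):
    """Pack ternary weights {-1, 0, +1} as 2-bit codes {0, 1, 2}, four per byte."""
    codes = (w + 1).astype(np.uint8)  # map -1/0/+1 -> 0/1/2
    assert len(codes) % 4 == 0
    b = codes.reshape(-1, 4)
    return (b[:, 0] | (b[:, 1] << 2) | (b[:, 2] << 4) | (b[:, 3] << 6)).astype(np.uint8)

def vec_dot_i2_i8(packed, x):
    """Dot product of 2-bit packed ternary weights with an int8 activation vector."""
    acc = 0
    for i, byte in enumerate(packed):
        for j in range(4):
            code = (int(byte) >> (2 * j)) & 0x3       # extract one 2-bit code
            acc += (code - 1) * int(x[4 * i + j])     # decode to -1/0/+1 and accumulate
    return acc

w = np.array([1, -1, 0, 1, -1, 0, 1, 1], dtype=np.int8)
x = np.array([3, -2, 5, 1, 4, -7, 2, 6], dtype=np.int8)
assert vec_dot_i2_i8(pack_ternary(w), x) == int(np.dot(w.astype(np.int32), x))
```

The inner loop makes clear why this routine can dominate single-threaded runtime: every weight requires a shift, a mask, and a multiply-accumulate, which is exactly what the optimized kernels vectorize.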

Does the [bitnet-b1.58-2B-4T](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T) model support tool calling? If so, could you suggest how we can use this feature with bitnet.cpp? - https://github.com/ggml-org/llama.cpp/blob/master/docs/function-calling.md

As we can see, there are three new official BitNet 2B models: model1: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T model2: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16 model3: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf What I know so far is that model1 is used for GPU inference, model2...

Does anyone have the code for https://bitnet-demo.azurewebsites.net/ ?

Following all the steps as provided, I uninstalled and reinstalled everything, but the error persists: D:\BitNet\3rdparty\llama.cpp\common\common.cpp(445,32): error: no type named 'system_clock' in namespace 'std::chrono' [compile.log](https://github.com/user-attachments/files/20035110/compile.log) [generate_build_files.log](https://github.com/user-attachments/files/20035111/generate_build_files.log) [install_gguf.log](https://github.com/user-attachments/files/20035112/install_gguf.log)

Sorry for my ignorance, but the official paper mentions that GPU kernels will be released... did that happen already, or am I just looking in the wrong places? Thanks

I want to continue pretraining the 1.58-bit 2B model to add more of my language, or fine-tune it for specific knowledge. Is there any base code I could start with to...

I am no expert, so please forgive the naive questions, but: 1) Is there any way to integrate KBLaM into these models? 2) Is it possible to fine-tune the models...

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
```

error:

```
[/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py](https://localhost:8080/#)...
```