Yan Xia
Thanks for the question. T-MAC introduces lookup-table methods for low-bit model inference, which generally work for 1-bit, 2-bit, 4-bit models, and so on...
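To illustrate the lookup-table idea, here is a minimal Python sketch of how a LUT can replace per-element multiplies for low-bit weights. This is an illustrative assumption about the general technique, not the actual T-MAC kernel: it assumes ternary weights in {-1, 0, 1} and a group size of 4, and the helper names (`build_lut`, `lut_dot`) are hypothetical.

```python
# Hedged sketch of LUT-based low-bit dot products, NOT the real T-MAC code.
# Assumptions: ternary weights in {-1, 0, 1}, group size G = 4.
import itertools
import numpy as np

G = 4  # weights per group; the table enumerates all 3**G weight patterns

def build_lut(x_group):
    """Precompute the dot product of one activation group against
    every possible G-long pattern of ternary weights."""
    patterns = itertools.product((-1, 0, 1), repeat=G)
    return {p: float(np.dot(p, x_group)) for p in patterns}

def lut_dot(weights, x):
    """Dot product of a ternary weight vector with activations x,
    replacing per-element multiplies with table lookups."""
    total = 0.0
    for i in range(0, len(x), G):
        lut = build_lut(x[i:i + G])            # one table per activation group
        total += lut[tuple(weights[i:i + G])]  # a single lookup per weight group
    return total

x = np.array([0.5, -1.0, 2.0, 0.25, 1.5, 0.0, -0.5, 1.0])
w = [1, 0, -1, 1, -1, 1, 0, 0]
assert abs(lut_dot(w, x) - np.dot(w, x)) < 1e-9
```

In a real kernel the tables would be built once per activation vector and reused across every weight row, which is where the speedup over per-element multiplication comes from.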
You can use this model for post-training: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16
> I trashed my clone & rebuilt successfully by simply following the README instructions...
>
> I feel stupid 😅... I had to edit code, adding `#include ` in `3rdparty\llama.cpp\common\common.cpp` and...
You can pull the latest llama.cpp, merge our changes, and build it yourself.
This model is currently English-only, so other languages such as Chinese are not expected to work well.
Can you tell us your environment and the model you are using?
FAQ (Frequently Asked Questions) 📌

Q1: The build dies with errors in llama.cpp due to std::chrono issues in log.cpp?

A: This is an issue introduced in a recent version of llama.cpp....
There is a GGUF model update on Hugging Face, which may cause this issue if you have not synced the code to the latest version.
You should download the GGUF file instead of the fp (full-precision) version; that way it will not trigger model conversion.
Which local model are you using? To my understanding, the demo site is running the exact same model as the one provided here: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf