Yan Xia
Thanks for the question. T-MAC introduces lookup-table methods for low-bit model inference, which generally work for 1-bit, 2-bit, 4-bit models, and so on...
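To illustrate the lookup-table idea, here is a minimal Python sketch of how a LUT can replace per-element multiplies for low-bit weights. This is an illustrative assumption about the general technique, not the actual T-MAC kernel: it assumes ternary weights in {-1, 0, 1} and a group size of 4, and the helper names (`build_lut`, `lut_dot`) are hypothetical.

```python
# Hedged sketch of LUT-based low-bit dot products, NOT the real T-MAC code.
# Assumptions: ternary weights in {-1, 0, 1}, group size G = 4.
import itertools
import numpy as np

G = 4  # weights per group; the table enumerates all 3**G weight patterns

def build_lut(x_group):
    """Precompute the dot product of one activation group against
    every possible G-long pattern of ternary weights."""
    patterns = itertools.product((-1, 0, 1), repeat=G)
    return {p: float(np.dot(p, x_group)) for p in patterns}

def lut_dot(weights, x):
    """Dot product of a ternary weight vector with activations x,
    replacing per-element multiplies with table lookups."""
    total = 0.0
    for i in range(0, len(x), G):
        lut = build_lut(x[i:i + G])            # one table per activation group
        total += lut[tuple(weights[i:i + G])]  # a single lookup per weight group
    return total

x = np.array([0.5, -1.0, 2.0, 0.25, 1.5, 0.0, -0.5, 1.0])
w = [1, 0, -1, 1, -1, 1, 0, 0]
assert abs(lut_dot(w, x) - np.dot(w, x)) < 1e-9
```

In a real kernel the tables would be built once per activation vector and reused across every weight row, which is where the speedup over per-element multiplication comes from.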
You can use this model for post-training: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16
> I trashed my clone & rebuilt successfully by simply following the README instructions...
>
> I feel stupid 😅... I had to edit code, adding `#include ` in `3rdparty\llama.cpp\common\common.cpp` and...
You can pull the latest llama.cpp, merge our changes, and build it yourself.
This model is currently English-only, so other languages such as Chinese are not expected to work well.
Can you tell us your environment and the model you are using?
FAQ (Frequently Asked Questions) 📌

Q1: The build dies with errors in llama.cpp due to std::chrono issues in log.cpp?

A: This is an issue introduced in a recent version of llama.cpp....
There is a GGUF model update on Hugging Face, which may cause this issue if you have not synced the code to the latest version.
You should download the GGUF file instead of the fp (full-precision) version; that way it will not trigger model conversion.
Which local model are you using? To my understanding, the demo site is running the exact same model as the one provided here: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf