[New Bitnet Model Support Request] Deepgrove model Bonsai 0.5B - Add Channel Scales
A new SOTA bitnet model, Bonsai 0.5B, has come out. It appears to outperform larger bitnet models such as Falcon 1B, Falcon 3B, and TriLM 700M. It also looks like they plan to release a whole line of bitnet models, which is really exciting.
Support is needed for these models. They use channel-wise scaling factors rather than the tensor-level ones used so far. Maybe a separate kernel could be built to apply the scales outside of the matmul kernels? That would probably yield similar inference speeds. Note that the Hugging Face implementation does include a custom Q-linear layer that applies the scales.
HF: https://huggingface.co/deepgrove/Bonsai
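For illustration, here is a minimal PyTorch sketch of what a channel-scaled ternary linear layer could look like; this is not the deepgrove implementation, and the names (`BitLinearChannelScale`, `weight_ternary`, `channel_scale`) are placeholders I am assuming. The point is that the matmul itself only touches the ternary weights (so existing scale-free bitnet matmul kernels could be reused), and the per-channel scales reduce to a cheap elementwise multiply afterward:

```python
import torch
import torch.nn as nn

class BitLinearChannelScale(nn.Module):
    """Hypothetical sketch: ternary weights + per-output-channel scales."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Ternary weight matrix with values in {-1, 0, +1}
        self.weight_ternary = nn.Parameter(
            torch.randint(-1, 2, (out_features, in_features)).float(),
            requires_grad=False,
        )
        # One scale per output channel (row of the weight matrix),
        # instead of a single tensor-level scale.
        self.channel_scale = nn.Parameter(
            torch.ones(out_features), requires_grad=False
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The matmul only sees ternary weights, so it maps onto the
        # existing bitnet matmul kernels...
        y = torch.nn.functional.linear(x, self.weight_ternary)
        # ...and the channel scales are applied afterwards as an
        # elementwise multiply, which is what a separate "apply scales"
        # kernel would do.
        return y * self.channel_scale


# Quick shape check
x = torch.randn(2, 8)
layer = BitLinearChannelScale(8, 4)
print(layer(x).shape)  # torch.Size([2, 4])
```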
Seems super promising.
Pinging @Eddie-Wang1120 and other kernel writers.
Other posts and information:
https://www.reddit.com/r/LocalLLaMA/comments/1jgkqio/new_bitnet_model_from_deepgrove/
https://x.com/deepgrove_ai/status/1903103798735761518
Thanks for the notice. Based on our experiments below, for current bitnet models we recommend using the one in our recent release. We would be happy to merge the Bonsai model if anyone can make a PR for it. Thanks.