BitNet

Official inference framework for 1-bit LLMs

Results 227 BitNet issues

While reviewing the code in this repository, I noticed a few areas that could be optimized for efficiency. I decided to make some changes to how the models are loaded...

Add a GEMM kernel for int2 weights. Also fix scaling problems in the previous BitLinear kernel.
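For readers unfamiliar with the idea, here is a minimal pure-Python sketch of what a GEMM over int2 weights with a single scale factor can look like. It is not the kernel from this pull request: the packing layout (four 2-bit codes per byte, code = weight + 1), the per-tensor `w_scale`, and the names `pack_int2`, `unpack_int2`, and `gemm_int2` are all assumptions made for illustration.

```python
from typing import List


def pack_int2(row: List[int]) -> bytes:
    """Pack ternary weights (-1, 0, 1) into 2-bit codes, four per byte.

    Assumed layout (hypothetical): code = weight + 1, lowest bits first.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(row), 4):
        byte = 0
        for j, w in enumerate(row[i:i + 4]):
            if w not in (-1, 0, 1):
                raise ValueError("weights must be ternary")
            byte |= (w + 1) << (2 * j)  # codes 0, 1, 2 stand for -1, 0, 1
        out.append(byte)
    return bytes(out)


def unpack_int2(packed: bytes) -> List[int]:
    """Invert pack_int2: recover the ternary weights from packed bytes."""
    weights = []
    for byte in packed:
        for j in range(4):
            weights.append(((byte >> (2 * j)) & 0b11) - 1)
    return weights


def gemm_int2(
    x: List[List[int]], packed_w: List[bytes], w_scale: float
) -> List[List[float]]:
    """Compute y = (x @ W^T) * w_scale, with each row of W stored packed.

    x is an M x K activation matrix; packed_w holds N rows of K/4 bytes.
    The scale is applied once per output element, after integer accumulation.
    """
    rows = [unpack_int2(p) for p in packed_w]
    return [
        [w_scale * sum(a * w for a, w in zip(x_row, w_row)) for w_row in rows]
        for x_row in x
    ]


W = [[1, 0, -1, 1], [-1, -1, 0, 0]]
packed = [pack_int2(r) for r in W]
y = gemm_int2([[1, 2, 3, 4], [0, 1, 0, 1]], packed, 0.5)
```

Accumulating in integers and multiplying by the scale only at the end is one way to avoid applying the scale more than once. Whether that matches the scaling fix described here is not stated in the snippet.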

Fixed an error in `setup_env.py` that prevented downloading the **_Falcon3 Base_** models

Hi team, I'm trying to use this model locally on Windows with minimal system dependencies, ideally via llama.cpp. In OpenAI Playground, we can set a System Instruction and the model...

A bug has been reported, but further details are needed to provide actionable steps for reproduction and resolution. Please specify the observed behavior, expected outcome, and any relevant environment or...

```
class BitLinearInference(nn.Module):
    def __init__(
        self,
        in_features: int,
        out_features: int,
    ):
        super().__init__()
        self.in_f = in_features
        self.out_f = out_features
        self.register_buffer("w", torch.empty((out_features, in_features)))
        self.register_buffer("w_scale", torch.empty((1,), dtype=torch.float32))
        self.norm = nn.RMSNorm(
            normalized_shape=in_features, eps=1e-5, elementwise_affine=True
        )
        ...
```

# Description BitNet.cpp is nearly impossible to deploy on a low-end ARM device due to the low I/O rate of the chipset. By introducing parallel compilation, this issue could be...