BitNet
Official inference framework for 1-bit LLMs
While reviewing the code in this repository, I noticed a few areas that could be optimized for efficiency. I decided to make some changes to how the models are loaded...
Add GEMM kernel for int2 weights. Also fix scaling problems in the previous BitLinear kernel.
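For context, an int2 GEMM operates on weights packed four-per-byte. A minimal sketch of that packing for ternary weights, where the function names and the low-bits-first layout are assumptions for illustration, not the kernel's actual format:

```python
def pack_int2(weights):
    """Pack ternary weights {-1, 0, 1} four-per-byte.

    Each weight is offset by +1 to fit the unsigned 2-bit range {0, 1, 2}.
    Low bits hold the first weight of each group (layout is an assumption).
    """
    assert len(weights) % 4 == 0
    packed = bytearray()
    for i in range(0, len(weights), 4):
        b = 0
        for j, w in enumerate(weights[i:i + 4]):
            b |= (w + 1) << (2 * j)
        packed.append(b)
    return bytes(packed)

def unpack_int2(packed, n):
    """Inverse of pack_int2: recover the first n ternary weights."""
    return [((byte >> (2 * j)) & 0b11) - 1
            for byte in packed for j in range(4)][:n]

ws = [-1, 0, 1, 1, 0, -1, 1, 0]
assert unpack_int2(pack_int2(ws), len(ws)) == ws  # round-trips exactly
```

Packing four weights per byte is what makes the 1.58-bit representation memory-efficient; the GEMM kernel then unpacks and scales on the fly.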
Fixed error in `setup_env.py` that prevented downloading of **_Falcon3 Base_** models
Hi team, I'm trying to use this model locally on Windows with minimal system dependencies, ideally via llama.cpp. In OpenAI Playground, we can set a System Instruction and the model...
A bug has been reported, but further details are needed to provide actionable steps for reproduction and resolution. Please specify the observed behavior, expected outcome, and any relevant environment or...
specify alignment of A_local
```python
import torch
import torch.nn as nn

class BitLinearInference(nn.Module):
    def __init__(
        self,
        in_features: int,
        out_features: int,
    ):
        super().__init__()
        self.in_f = in_features
        self.out_f = out_features
        # Quantized weight and its per-tensor scale are registered as
        # buffers rather than parameters: they are frozen at inference time.
        self.register_buffer("w", torch.empty((out_features, in_features)))
        self.register_buffer("w_scale", torch.empty((1,), dtype=torch.float32))
        self.norm = nn.RMSNorm(
            normalized_shape=in_features,
            eps=1e-5,
            elementwise_affine=True,
        )
        ...
```
# Description

BitNet.cpp is nearly impossible to deploy on a low-end ARM device due to the low I/O rate of the chipset. By introducing parallel compiling, this issue could be...