Nianhui Guo

4 issues by Nianhui Guo

Great work, and thanks a lot for open-sourcing it! While reusing the released code, I ran into the issues below: I cannot reproduce the W1A1 version BiT...

Hi, really good work, and I appreciate it a lot. I am curious whether Triton can support 1-bit acceleration for MMA, and whether this could be further applied to 1-bit GPTQ?

Really solid work! May I ask what the actual compressed model size is, considering that the method uses partial binarization and there are some 8-bit parameters inside each weight...

Hi, thanks for your great work and the decision to open-source it. I am trying different quantization group sizes (from 128 to 64/32) by changing the default hyperparameter 'group_size', but the GEMM results...