Yuxuan Hu

3 comments by Yuxuan Hu

> I tried to measure GPTQ's perplexity (PPL) on the OPT model via opt.py. With fake quantization, the OPT model's PPL is normal. However,...

Thank you very much for your reply! The input consists of two activations X1[L, D1], X2[L, D2] and two weight matrices W1[D, D1], W2[D, D2], where $L = 2048, D_1...

I have the same question. In the meantime, if sparse tensor cores are not yet supported, could we implement a kernel in Triton that loads the weights in sparse format and computes in dense format?