Yuxuan Hu
Results: 3 issues of Yuxuan Hu
I tried to measure GPTQ's perplexity (PPL) on the OPT model via opt.py. With fake quantization, the PPL of the OPT model is normal. However, when...
In CuTe, I can find files like "include/cute/arch/mma_sm90_gmma_sparse.hpp", but can't find similar files for sm80?
question
? - Needs Triage
inactive-30d
Dear Team, I wish to implement a fused mixed-precision matrix multiplication, such as w4a4 + w16a16, where the w16a16 part is small. An example of this kernel used is...
question
? - Needs Triage