jpyo0803

Results 3 issues of jpyo0803

Hi, I am wondering how to quantize llama3-8B with SmoothQuant. What dataset did you use to generate the activation scales? Or do you plan to upload act_scales, model weights (to huggingface),...

I am wondering whether matrix multiplication with 32-bit integer inputs and outputs is possible with Triton?

Hi, I am wondering which version of "transformers" you referred to when writing "int_llama_layer.py"? Thanks in advance!