Results: 3 issues of jpyo0803
Hi, I am wondering how to quantize Llama3-8B with SmoothQuant. What dataset did you use to generate the activation scales? Or do you plan to upload act_scales, model weights (to Hugging Face),...
I am wondering whether a matrix multiplication with 32-bit integer inputs and output is possible with Triton?
Hi, I am wondering which "transformers" version you referred to when writing "int_llama_layer.py"? Thanks in advance!