Tairen Piao
Results: 2 issues of Tairen Piao
Dear AIMET team, In the C++/CUDA quantization kernel, many functions use `int cnt`. However, for some large models (e.g., LLaMA, Stable Diffusion), cnt can overflow the range of a 32-bit...
Dear AIMET team, In the C++/CUDA quantization kernel, many functions use `int cnt`. However, for some large models (e.g., LLaMA, Stable Diffusion), cnt can overflow the range of a 32-bit...
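To illustrate the concern raised in these issues, here is a minimal sketch. It assumes `cnt` holds a tensor's element count; the snippet above is truncated, so that is an assumption. The helper names `elementCount` and `fitsInInt32` are hypothetical and are not part of AIMET. The sketch only shows that a count computed in 64 bits can exceed what a 32-bit `int` can represent.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an AIMET function): computes rows * cols in
// 64-bit arithmetic so the product of two 32-bit dimensions cannot wrap.
std::int64_t elementCount(std::int32_t rows, std::int32_t cols)
{
    return static_cast<std::int64_t>(rows) * static_cast<std::int64_t>(cols);
}

// Hypothetical helper (not an AIMET function): reports whether a count
// could be stored in a 32-bit signed counter such as `int cnt`.
bool fitsInInt32(std::int64_t count)
{
    return count >= 0 && count <= std::numeric_limits<std::int32_t>::max();
}
```

For example, a 65536 x 32768 tensor has 2^31 elements, one more than the largest 32-bit signed value, so `fitsInInt32` returns false for it. Widening the counter to a 64-bit type is one possible remedy; whether that matches the fix proposed in the issues cannot be determined from the truncated text.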