Tairen Piao

Results: 2 issues by Tairen Piao

Dear AIMET team, In the C++/CUDA quantization kernel, many functions use `int cnt`. However, for some large models (e.g., LLaMA, Stable Diffusion), `cnt` can overflow the range of a 32-bit...

Dear AIMET team, In the C++/CUDA quantization kernel, many functions use `int cnt`. However, for some large models (e.g., LLaMA, Stable Diffusion), `cnt` can overflow the range of a 32-bit...
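
The concern in the snippet above can be illustrated with a minimal host-side C++ sketch. It is not taken from the AIMET source: the names `fitsInInt32`, `elementCount`, and `scaleInPlace` are hypothetical, and the loop is only a CPU stand-in for a kernel that receives an element count. It assumes the count is carried as a 64-bit integer instead of `int`, which is one possible way to avoid the overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not an AIMET function): does an element count
// fit in a signed 32-bit integer such as `int cnt`?
constexpr bool fitsInInt32(std::int64_t cnt) {
    return cnt >= 0 && cnt <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: total number of elements for a tensor shape,
// accumulated in 64 bits so the product itself cannot wrap at 2^31.
inline std::int64_t elementCount(const std::vector<std::int64_t>& shape) {
    std::int64_t cnt = 1;
    for (std::int64_t dim : shape) {
        cnt *= dim;
    }
    return cnt;
}

// Hypothetical CPU stand-in for an element-wise kernel. Both the count
// and the loop index are 64-bit, so buffers with more than
// 2,147,483,647 elements can be indexed without overflow.
inline void scaleInPlace(float* data, std::int64_t cnt, float scale) {
    assert(cnt >= 0);
    for (std::int64_t i = 0; i < cnt; ++i) {
        data[i] *= scale;
    }
}
```

For example, a shape of 65536 x 32768 has exactly 2^31 elements, one more than the largest value a signed 32-bit `int` can hold, so `fitsInInt32(elementCount({65536, 32768}))` is false. In an actual CUDA kernel the same idea would also apply to the computed global thread index, which is an assumption about the fix and not something the truncated snippet states.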