[QST] Quantized conv with s8 output and s32 bias

Open • jstoecker opened this issue 9 months ago • 1 comment

When implementing a quantized GEMM/convolution with INT8 activations and weights, it's common for the bias to be INT32. The usual trick for adding a bias seems to be pointing the C matrix at the bias vector with a stride of 0, so the same bias row is read for every output row. That approach requires ElementC to be declared as INT32, but I also want the output of the convolution to be INT8, and as far as I can tell ElementC is implicitly the output data type as well.
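For reference, the computation being asked for can be written as a plain C++ loop. This is only a sketch of the intended semantics, not CUTLASS code: the function name `quantized_gemm_s8_ref`, the single per-tensor `scale`, the row-major layouts, and round-to-nearest before saturation are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not a CUTLASS API).
// D[m][n] = saturate_s8(round(scale * (sum_k A[m][k] * B[k][n] + bias[n])))
std::vector<int8_t> quantized_gemm_s8_ref(
    const std::vector<int8_t>& A,      // M x K activations, row-major
    const std::vector<int8_t>& B,      // K x N weights, row-major
    const std::vector<int32_t>& bias,  // N entries, one per output column/channel
    int M, int N, int K,
    float scale)                       // assumed per-tensor requantization scale
{
    assert(A.size() == static_cast<std::size_t>(M) * K);
    assert(B.size() == static_cast<std::size_t>(K) * N);
    assert(bias.size() == static_cast<std::size_t>(N));

    std::vector<int8_t> D(static_cast<std::size_t>(M) * N);
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            // Starting from bias[n] for every m is what "C with a row
            // stride of 0" amounts to: the bias row is broadcast over rows.
            int32_t acc = bias[n];
            for (int k = 0; k < K; ++k) {
                acc += static_cast<int32_t>(A[m * K + k]) *
                       static_cast<int32_t>(B[k * N + n]);
            }
            // Requantize the s32 accumulator: scale, round, saturate to s8.
            float scaled = std::nearbyint(scale * static_cast<float>(acc));
            float clamped = std::min(127.0f, std::max(-128.0f, scaled));
            D[m * N + n] = static_cast<int8_t>(clamped);
        }
    }
    return D;
}
```

The point of the sketch is that the source operand (bias, s32) and the destination (D, s8) have different element types, which is exactly what a single shared ElementC cannot express.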

It's not clear how to achieve what I want with 2.x APIs/epilogues. Do I need to use EVT to accomplish this?

jstoecker • Apr 30 '25 17:04

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot] • May 30 '25 18:05

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot] • Aug 28 '25 19:08