Justin Stoecker
Hello, The [awq_quantize](https://github.com/intel/neural-compressor/blob/42c2def02e128818f19d8342052ab0544e9623f7/neural_compressor/adaptor/ox_utils/weight_only.py#L703) function [collects the names of input tensors to each MatMul node](https://github.com/intel/neural-compressor/blob/42c2def02e128818f19d8342052ab0544e9623f7/neural_compressor/adaptor/ox_utils/weight_only.py#L758-L764), and later [looks up the parent node that produces the named tensor](https://github.com/intel/neural-compressor/blob/42c2def02e128818f19d8342052ab0544e9623f7/neural_compressor/adaptor/ox_utils/weight_only.py#L783). This assumes the tensors...
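
For context, a minimal sketch of that lookup pattern (not the actual neural-compressor implementation; the function name below is hypothetical) could look like this:

```python
import onnx

def find_matmul_parents(model_path):
    """Collect the input tensor names of every MatMul node, then look up
    the node that produces each of those tensors."""
    model = onnx.load(model_path)
    graph = model.graph

    # Map each tensor name to the node that produces it.
    output_name_to_node = {}
    for node in graph.node:
        for out in node.output:
            output_name_to_node[out] = node

    # For each MatMul, resolve its activation input back to a parent node.
    # Graph inputs and initializers have no producing node, so the lookup
    # returns None for them.
    parents = {}
    for node in graph.node:
        if node.op_type == "MatMul":
            act_name = node.input[0]
            parents[node.name] = output_name_to_node.get(act_name)
    return parents
```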
When implementing a quantized GEMM/convolution with INT8 activations and weights, it's common to also have the bias as INT32. The usual trick for adding a bias seems to be initializing...
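
For illustration, here is a minimal sketch of one common scheme, assuming the bias is pre-quantized with scale = input_scale * weight_scale so it can be added directly to the INT32 accumulator (the function and parameter names are hypothetical, not taken from the post above):

```python
import numpy as np

def quantized_gemm_with_bias(x_q, w_q, bias_fp32, x_scale, w_scale, out_scale):
    """INT8 GEMM with an INT32 bias added to the accumulator."""
    # INT32 accumulation of INT8 x INT8 products.
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32)

    # Quantize the bias to INT32 using the accumulator's effective scale,
    # so it can be summed into the accumulator without rescaling.
    bias_q = np.round(bias_fp32 / (x_scale * w_scale)).astype(np.int32)
    acc += bias_q

    # Requantize the accumulator to the output scale and clamp to INT8.
    out = np.round(acc * (x_scale * w_scale) / out_scale)
    return np.clip(out, -128, 127).astype(np.int8)
```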