Add variable bit width support to ONNXToDeepsparse

Open KSGulin opened this issue 2 years ago • 1 comment

This PR adds support for variable-bit weight quantization in the ONNXToDeepsparse exporter. This affects two steps:

Test Plan: Local sparsify runs with int-4 weight quantization
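
For readers unfamiliar with variable-bit quantization, here is a minimal sketch of how a bit-width argument can drive the clamping range of an affine weight quantizer. It is an illustration only, not the code in this PR: the helper names `quant_range` and `quantize_array`, their parameters, and the signed-by-default convention are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def quant_range(bit_width: int, signed: bool = True) -> Tuple[int, int]:
    """Return (qmin, qmax) for an integer type of the given width.

    Hypothetical helper: e.g. a signed 4-bit range is [-8, 7].
    """
    if bit_width < 1:
        raise ValueError("bit_width must be a positive integer")
    if signed:
        return -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    return 0, (1 << bit_width) - 1


def quantize_array(
    values: Sequence[float],
    scale: float,
    zero_point: int,
    bit_width: int = 8,
    signed: bool = True,
) -> List[int]:
    """Affine-quantize floats: q = clamp(round(v / scale) + zero_point).

    The clamp bounds come from bit_width instead of a hard-coded
    8-bit range, which is the point of the sketch.
    """
    qmin, qmax = quant_range(bit_width, signed)
    return [
        min(max(round(v / scale) + zero_point, qmin), qmax)
        for v in values
    ]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.26, 1.0]
    # Same weights and scale, two different bit widths.
    print(quantize_array(weights, scale=0.1, zero_point=0, bit_width=4))
    print(quantize_array(weights, scale=0.1, zero_point=0, bit_width=8))
```

With a fixed scale, the 4-bit call saturates the outer values at the narrower range, while the 8-bit call does not. In practice the scale would be chosen to match the bit width.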

Jul 03 '23 14:07 KSGulin

@bfineran good callout. Updated the array quantization routine and propagated the bit_width args to address this.

Jul 03 '23 16:07 KSGulin