finn-examples

Urgent unknown problem needs help!!!

Open · YZW-explorer opened this issue 4 years ago · 1 comment

I want to deploy a 4-bit quantized network on a PYNQ board. I saw that the MobileNet-v1 example in this repo is quantized to 4 bits, but the FINN documentation (https://finn.readthedocs.io/en/latest/source_code/finn.core.html) says: "Enum class that contains FINN data types to set the quantization annotation. ONNX does not support data types smaller than 8-bit integers, whereas in FINN we are interested in smaller integers down to ternary and bipolar." Why is this, and how can I work around it?

YZW-explorer avatar Mar 21 '21 04:03 YZW-explorer

Hi @YZW-explorer, good question! This is a limitation of ONNX for our purposes, and exactly one of the reasons why we use QONNX. By means of FINN's datatype annotation, we are able to go below 8-bit quantization, so you should be fine experimenting with 4-bit quantized models. For a simple example of how to export an MLP with 2-bit weights and activations, you could have a look here.
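To make the idea concrete, here is a minimal sketch (not FINN's actual API; all names below are hypothetical) of how sub-8-bit quantization can be carried on top of ONNX: tensor values live in an ordinary wide container such as float32, and a side-channel "quantization annotation" records the true FINN datatype, e.g. INT4, TERNARY, or BIPOLAR.

```python
import numpy as np

# Hypothetical datatype table: name -> (min, max) of the allowed integer range.
FINN_DTYPES = {
    "BIPOLAR": (-1, 1),    # values in {-1, +1}
    "TERNARY": (-1, 1),    # values in {-1, 0, +1}
    "INT4":    (-8, 7),
    "UINT4":   (0, 15),
    "INT8":    (-128, 127),
}

def allowed(values: np.ndarray, dtype: str) -> bool:
    """Check that every element fits the annotated FINN datatype."""
    lo, hi = FINN_DTYPES[dtype]
    ok_range = bool(np.all(values >= lo) and np.all(values <= hi))
    ok_int = bool(np.all(values == np.round(values)))
    if dtype == "BIPOLAR":      # bipolar excludes zero
        ok_range = ok_range and bool(np.all(values != 0))
    return ok_range and ok_int

# A 4-bit weight tensor stored in an ordinary float32 container:
weights = np.array([-8, -3, 0, 7], dtype=np.float32)

# The sub-8-bit information that ONNX itself cannot carry lives in an
# annotation keyed by tensor name, much like FINN's DataType annotation.
annotations = {"weights": "INT4"}

assert allowed(weights, annotations["weights"])   # fits INT4
assert not allowed(weights, "UINT4")              # -8 is out of range
```

The point is that the container datatype stays ONNX-legal while the annotation lets downstream hardware-generation passes pick narrow datapaths; in real FINN workflows, the annotation is produced automatically when you export a Brevitas model to QONNX rather than written by hand.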

mmrahorovic avatar Feb 14 '23 12:02 mmrahorovic