
Not the same as the paper

Open bongjeong opened this issue 5 years ago • 3 comments

The activation quantization is not the same as in the paper.

In the paper: x (real) is in the range [0, 1], i.e. clamp(input, 0, 1), then quantize(x)

In your implementation: clamp(input * 0.1, 0, 1)
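For concreteness, a minimal sketch of the two variants being compared, assuming a standard DoReFa-style k-bit quantizer (the name `quantize_k` is illustrative, not this repo's actual API):

```python
import torch

def quantize_k(r, k=8):
    # DoReFa-style k-bit quantizer for r in [0, 1]:
    # quantize_k(r) = round((2^k - 1) * r) / (2^k - 1)
    n = float(2 ** k - 1)
    return torch.round(r * n) / n

x = torch.randn(4)

# Paper: clamp to [0, 1], then quantize.
a_paper = quantize_k(torch.clamp(x, 0.0, 1.0))

# This implementation (as reported above): scale by 0.1 before clamping.
a_impl = quantize_k(torch.clamp(x * 0.1, 0.0, 1.0))
```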

bongjeong avatar Apr 10 '20 05:04 bongjeong

The DoReFa paper says: 'Here we assume the output of the previous layer has passed through a bounded activation function h, which ensures r ∈ [0, 1].' But the paper does not specify what the bounded activation h is. I think multiplying the activation by 0.1 can reduce the dynamic range of the activations and make the model perform better. I had an internship at Megvii, and they dealt with activation functions in this way.

Jzz24 avatar Apr 10 '20 08:04 Jzz24

I think DoReFa uses fully integer computation in the layers (except the first and last layers). Multiplying the activation by 0.1 is not in quantized format; it requires floating-point computation on the feature map (my guess). What do you think about it?

bongjeong avatar Apr 13 '20 10:04 bongjeong

Yes, I think so. At training time we use simulated quantization, so the activation layer's input is the dequantized result, which is in float format.
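To illustrate the simulated-quantization point, a minimal sketch assuming a straight-through estimator, which is what DoReFa-style training typically uses (`fake_quantize` is an illustrative name, not this repo's API):

```python
import torch

def fake_quantize(r, k=8):
    # Simulated ("fake") quantization for training: quantize then
    # dequantize, so the output stays in float format.
    n = float(2 ** k - 1)
    q = torch.round(r * n) / n
    # Straight-through estimator: forward uses q, backward treats
    # the rounding as identity so gradients flow through.
    return r + (q - r).detach()

x = torch.rand(4, requires_grad=True)
y = fake_quantize(torch.clamp(x, 0.0, 1.0))
y.sum().backward()  # gradient passes straight through the rounding
```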

Jzz24 avatar Apr 13 '20 11:04 Jzz24