
Not the same as the paper

Open bongjeong opened this issue 5 years ago • 3 comments

The activation quantization is not the same as in the paper.

In the paper: x (real) is in the range [0, 1], i.e. clamp(input, 0, 1), then quantize(x)

In your implementation: clamp(input * 0.1, 0, 1)
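For concreteness, a minimal sketch of the two variants being compared, assuming a standard DoReFa-style k-bit quantizer (the name `quantize_k` is illustrative, not this repo's actual API):

```python
import torch

def quantize_k(r, k=8):
    # DoReFa-style k-bit quantizer for r in [0, 1]:
    # quantize_k(r) = round((2^k - 1) * r) / (2^k - 1)
    n = float(2 ** k - 1)
    return torch.round(r * n) / n

x = torch.randn(4)

# Paper: clamp to [0, 1], then quantize.
a_paper = quantize_k(torch.clamp(x, 0.0, 1.0))

# This implementation (as reported above): scale by 0.1 before clamping.
a_impl = quantize_k(torch.clamp(x * 0.1, 0.0, 1.0))
```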

bongjeong avatar Apr 10 '20 05:04 bongjeong

The DoReFa paper says: 'Here we assume the output of the previous layer has passed through a bounded activation function h, which ensures r ∈ [0, 1].' But the paper does not specify what the bounded activation h is. I think multiplying the activation by 0.1 can reduce the dynamic range of the activations and make the model perform better. I had an internship at Megvii, and they dealt with activation functions in this way.

Jzz24 avatar Apr 10 '20 08:04 Jzz24

I think DoReFa uses fully integer computation in the layers (except the first and last layers). Multiplying the activation by 0.1 is not in quantized format; it requires floating-point computation on the feature map (my guess). What do you think about it?

bongjeong avatar Apr 13 '20 10:04 bongjeong

Yes, I think so. At training time we use simulated quantization, so the activation layer's input is the dequantized result, which is in float format.
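To illustrate the simulated-quantization point, a minimal sketch assuming a straight-through estimator, which is what DoReFa-style training typically uses (`fake_quantize` is an illustrative name, not this repo's API):

```python
import torch

def fake_quantize(r, k=8):
    # Simulated ("fake") quantization for training: quantize then
    # dequantize, so the output stays in float format.
    n = float(2 ** k - 1)
    q = torch.round(r * n) / n
    # Straight-through estimator: forward uses q, backward treats
    # the rounding as identity so gradients flow through.
    return r + (q - r).detach()

x = torch.rand(4, requires_grad=True)
y = fake_quantize(torch.clamp(x, 0.0, 1.0))
y.sum().backward()  # gradient passes straight through the rounding
```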

Jzz24 avatar Apr 13 '20 11:04 Jzz24