XiaobingZhang

Results 21 comments of XiaobingZhang

@jerryzh168, regarding https://github.com/pytorch/pytorch/blob/caa6ef15a294c96fad3bf67a10a8b4fa605080bb/torch/ao/quantization/fx/_equalize.py#L59-L70: I am confused by this code. If the scheme is per_tensor, why is PerChannelMinMaxObserver used?

The following is the FP32 performance data for conv+add, tested on SKX-6148 (test script: https://github.com/XiaobingSuper/op_bench/blob/main/conv_add.py): 1. BS=1, thread=1. input size | output channels | kernel | stride |...

@ZolotukhinM, I have removed the code that is not related to this PR, so it should be easier for you to review. Thanks!

Yes, this is expected: there is an issue with the norm reduction, which doesn't use the accumulate type.
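For illustration only, here is a minimal sketch of why the accumulate type matters in a norm reduction. It is plain Python, not the PyTorch kernel under discussion: the helper names (`to_fp16`, `norm_narrow_acc`, `norm_wide_acc`) and the input data are assumptions made up for this example, and half precision is simulated with the standard `struct` module.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def norm_narrow_acc(values):
    """L2 norm where the running sum is rounded to fp16 after every add."""
    acc = 0.0
    for v in values:
        acc = to_fp16(acc + to_fp16(v * v))
    return to_fp16(math.sqrt(acc))


def norm_wide_acc(values):
    """L2 norm where the running sum is kept in a wider type (Python float)."""
    acc = 0.0
    for v in values:
        acc += v * v
    # Only the final result is rounded back to the narrow output type.
    return to_fp16(math.sqrt(acc))


# Hypothetical input: many small fp16 values whose squares are tiny.
values = [to_fp16(0.01)] * 10000

narrow = norm_narrow_acc(values)
wide = norm_wide_acc(values)
print("fp16 accumulator:", narrow)
print("wide accumulator:", wide)
```

With the narrow accumulator, the running sum eventually becomes large enough that each new squared term is smaller than half the spacing between adjacent fp16 values, so further additions are rounded away and the result comes out too small. Accumulating in a wider type and rounding only at the end avoids this.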

Yes, it was fixed by https://github.com/pytorch/pytorch/pull/95166.

@jansel @desertfire, please help review this code again. Thanks!