XiaobingZhang comments

Results: 21 comments of XiaobingZhang

add qscheme check for quantization observer

@jerryzh168, regarding https://github.com/pytorch/pytorch/blob/caa6ef15a294c96fad3bf67a10a8b4fa605080bb/torch/ao/quantization/fx/_equalize.py#L59-L70: I am confused by this code. If the scheme is per_tensor, why is PerChannelMinMaxObserver used?

add qscheme check for quantization observer

@pytorchbot rebase

add qscheme check for quantization observer

@pytorchbot merge

nnc: enable onednn conv+sum(relu) fusion

The following is the FP32 performance data for conv+add, tested on SKX-6148 (test script: https://github.com/XiaobingSuper/op_bench/blob/main/conv_add.py): 1. BS=1, thread=1. input size | output channels | kernel | stride |...

nnc: enable onednn conv+sum(relu) fusion

@ZolotukhinM, I have removed the code that is not related to this PR, so it should be easier for you to review. Thanks!

[mkldnn_matmul] enable mkldnn matmul for aarch64 bf16 devices

@pytorchbot merge -g

Norm 1 of tensor - bfloat16 - rounding errors?

Yes, this is to be expected. There is an issue with the norm reduction: it doesn't use an accumulate type.
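
To illustrate why the accumulator type matters, here is a minimal pure-Python sketch of the general effect. It is not PyTorch's actual norm kernel: the helpers `to_bf16`, `to_f32`, `l1_norm_bf16_accumulator` and `l1_norm_f32_accumulator` are hypothetical names, and bfloat16 is emulated by rounding the float32 bit pattern to its top 16 bits.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round a small finite float to the nearest bfloat16 value (ties to even).

    Emulation only: bfloat16 is the top 16 bits of the float32 encoding.
    NaN, infinity and overflow are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even on the low 16 bits
    bits = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def l1_norm_bf16_accumulator(values):
    """Sum |v| while rounding the running total to bfloat16 after every add."""
    total = 0.0
    for v in values:
        total = to_bf16(total + abs(to_bf16(v)))
    return total


def l1_norm_f32_accumulator(values):
    """Sum |v| in a float32 accumulator and round to bfloat16 only at the end."""
    total = 0.0
    for v in values:
        total = to_f32(total + abs(to_bf16(v)))
    return to_bf16(total)


data = [1.0] * 4096
print(l1_norm_bf16_accumulator(data))  # 256.0
print(l1_norm_f32_accumulator(data))   # 4096.0
```

With only 8 significant bits, the bfloat16 running sum stops growing at 256, because 256 + 1 rounds back to 256. The wider accumulator reaches the exact result before the final rounding.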

Norm 1 of tensor - bfloat16 - rounding errors?

Yes, it was fixed by https://github.com/pytorch/pytorch/pull/95166.

inductor: rewrite mkldnn fx fusion using pattern_matcher(conv_unary)

@jansel @desertfire, please help review this code again. Thanks!

inductor: rewrite mkldnn fx fusion using pattern_matcher(conv_unary)

@pytorchbot merge