Vasiliy Kuznetsov

111 comments by Vasiliy Kuznetsov

What's the context here: which benchmark suite is this, and what is the goal? In general, more benchmarking of quantized models would be really valuable, let me know how our...

Sounds great. I'd recommend starting with `MobileNetV2`, since it is easy and we already benchmark it internally, so we can compare data. We can use FX graph mode quant on that...

cc @HDCharles who is interested in testing this out for mobilenetv2

@HDCharles, just so that nothing is blocked, I'd recommend sticking to training and inference only (no calibration) in the first PR. Then, if we decide that calibration is OK...

I'm a little worried about scope creep here before we actually know that this data is reliable. Would it make sense to just do the simplest possible thing first (just...

> @vkuzo I agree with that, iiuc you're referring to (c) in my list?

yeah, that sounds great to me. If I had to rank a, b and c by...

> @HDCharles, please confirm if pretrained quantized models have to be calibrated for benchmarking, or if something like #417 would suffice for benchmarking. Thank you!

as long as the model...

this is great! API looks good, I'll defer to others for the cutlass part.

Hi @Abhijit-2592, in `MXInferenceLinear`, the relevant code snippet is:

```
new_mod.weight_mx = MXTensor.to_mx(
    mod.weight.t().contiguous(), elem_dtype, block_size=block_size
).t()
```

Source: https://github.com/pytorch/ao/blob/26e790df61e23b2ba340c36b84eb9940fec100bb/torchao/prototype/mx_formats/mx_linear.py#L86

There are two calls to `t()` in this snippet...
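
For readers unfamiliar with the `.t().contiguous().t()` pattern in the snippet above, here is a minimal sketch of its effect on memory layout. It is pure Python and does not use torch or `MXTensor`: the `View2D` class is a hypothetical stand-in that only mimics how a strided 2-D tensor treats a transpose (a view that swaps shape and strides) and `contiguous()` (a copy into row-major storage). What `MXTensor.to_mx` does with that layout is not shown in the excerpt, so the sketch makes no claim about it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class View2D:
    """Hypothetical stand-in for a 2-D strided tensor (not a torch API)."""

    data: List[float]         # flat storage
    shape: Tuple[int, int]    # (rows, cols)
    strides: Tuple[int, int]  # step in elements along each dimension

    def at(self, i: int, j: int) -> float:
        # Look up a logical element through the strides.
        return self.data[i * self.strides[0] + j * self.strides[1]]

    def t(self) -> "View2D":
        # Transpose is a view: swap shape and strides, share the storage.
        return View2D(self.data, self.shape[::-1], self.strides[::-1])

    def contiguous(self) -> "View2D":
        # Copy the logical elements into fresh row-major storage.
        rows, cols = self.shape
        flat = [self.at(i, j) for i in range(rows) for j in range(cols)]
        return View2D(flat, (rows, cols), (cols, 1))

    def tolist(self) -> List[List[float]]:
        rows, cols = self.shape
        return [[self.at(i, j) for j in range(cols)] for i in range(rows)]


# A 2x3 "weight" stored row-major.
w = View2D([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], (2, 3), (3, 1))

# Transpose, materialize, transpose back.
w2 = w.t().contiguous().t()

print(w2.tolist())   # same logical values as w
print(w2.strides)    # but the storage is now column-major
print(w2.data)
```

In this model the round trip leaves the logical values and shape unchanged, while the underlying storage is reordered so that elements of the same column sit next to each other in memory.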