supriyar

Results 18 comments of supriyar

@jerryzh168 we have a way to preserve these ops in export, right?

torchao quantizes Linear layers. However, depending on batch size and layer shape, you may see different levels of performance improvement from different techniques. E.g., weight-only works best for bs=1 while dynamic...
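To make the batch-size dependence concrete, here is a rough back-of-the-envelope sketch in plain Python (no torchao calls). It assumes a simple traffic model where a Linear layer reads its weights once and reads/writes its activations once; the helper name `linear_arithmetic_intensity` and the 4096x4096 shape are hypothetical, chosen only for illustration. Under these assumptions, at bs=1 nearly all memory traffic is weights, so shrinking the weights is what moves the needle; at larger batch sizes the FLOPs-per-byte ratio grows and weight size matters less.

```python
def linear_arithmetic_intensity(batch_size, in_features, out_features,
                                weight_bytes=2.0, act_bytes=2.0):
    """FLOPs per byte moved for y = x @ W.T under a simplified traffic model.

    Hypothetical helper for illustration only; not part of torchao.
    weight_bytes / act_bytes are bytes per element (2.0 ~ bf16, 1.0 ~ int8).
    """
    # One multiply and one add per (batch, in, out) triple.
    flops = 2 * batch_size * in_features * out_features
    # Weights are read once regardless of batch size.
    weight_traffic = in_features * out_features * weight_bytes
    # Input activations are read and output activations are written once.
    act_traffic = batch_size * (in_features + out_features) * act_bytes
    return flops / (weight_traffic + act_traffic)


if __name__ == "__main__":
    for bs in (1, 16, 256):
        bf16 = linear_arithmetic_intensity(bs, 4096, 4096, weight_bytes=2.0)
        int8_w = linear_arithmetic_intensity(bs, 4096, 4096, weight_bytes=1.0)
        print(f"bs={bs:4d}  bf16 weights: {bf16:8.2f} FLOPs/byte  "
              f"int8 weights: {int8_w:8.2f} FLOPs/byte")
```

This only models memory traffic versus compute. Actual speedups depend on the kernels, hardware, and layer shapes, so benchmarking on your own model is still the way to pick a technique.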

Hi @ynimmaga, nice to see you here - I believe we met last year at the PTC poster sessions, and afterwards to discuss how to use PyTorch quantization with OpenVINO....

@ynimmaga what kind of use cases do you have in mind? We won't have the bandwidth to support OpenVINO specifically, but if that's something your team would like to contribute...

@jerryzh168 @kwen2501 is this addressed now with the quantize + distributed inference composability work?

cc @HDCharles, who has been looking into MoE quantization and grouped GEMM recently

I believe https://github.com/pytorch/ao/issues/2147 has some details, but it is likely not the full list. cc @danielvegamyhre @vkuzo