mrbean
@ChainYo I would like to get started on the ONNX Config for DeBERTaV2!
@philschmid @JingyaHuang should we put this in? Feels like a bug
@philschmid mind re-reviewing?
Setting the format to QDQ works around this, but the QOperator format seems broken.
@philschmid I have raised the concerns in the issue above
@philschmid I am also tracking this in https://github.com/microsoft/onnxruntime/issues/12133 and https://github.com/microsoft/onnxruntime/issues/12173, but it is becoming unclear whether the issue truly lies there or whether Optimum is creating a quantized model that can...
@philschmid I am also seeing this weird behavior: https://github.com/NVIDIA/TensorRT/issues/2146. I thought this was an oddity of TensorRT, but it seems like the same thing is happening when I use your...
@lewtun @michaelbenayoun @JingyaHuang what do you think?
pinging @echarlaix and @JingyaHuang once again. This is a blocker for quantizing very large models so would love to see this go in!
@lewtun I saw there were some tests around static quantization in `tests/test_optimization.py` so I put a unit test in there. That work for you?