Lewis Bails

Results: 12 comments of Lewis Bails

Hey @AmaliePauli, I see you're using the BotXO weights for your BertTone model. Is that the version 1 or version 2 representations? https://github.com/botxo/nordic_bert

Thanks for that @fxmarty. That could certainly explain why PyTorch inference is faster on my machine! But regarding the ORT model performance, your models seem much quicker...

I had to make a few tweaks to your script to get around some errors that were popping up. Are you using `optimum==1.3.0`? These were my results on M1: ```...

Also, I had to go up to `atol=3` to get the logits comparison between the vanilla ONNX model and ONNX-quantized model to pass. Seems large, but I'm not familiar enough...
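A minimal sketch of the comparison I mean, using `numpy.allclose` (the array values here are hypothetical placeholders, not the actual model outputs):

```python
import numpy as np

# Hypothetical logits standing in for the vanilla ONNX and quantized outputs.
onnx_logits = np.array([-3.32, 3.737, 0.5])
quantized_logits = np.array([-5.626, 3.52, 0.4])

# With the default tolerances this fails; bumping atol up to 3 lets it pass.
print(np.allclose(onnx_logits, quantized_logits))          # False
print(np.allclose(onnx_logits, quantized_logits, atol=3))  # True
```

An absolute tolerance of 3 on raw logits is very loose, which is why it stood out; whether that gap matters depends on whether the post-softmax predictions still agree.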

Running it again with the random input ids:

```python
(Min, Max) PyTorch: (-3.349, 3.752)
(Min, Max) ONNX Runtime: (-3.32, 3.737)
(Min, Max) ONNX Runtime quantized: (-5.626, 3.52)
```
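For reference, the random input ids can be generated with something like the following (the batch size, sequence length, and BERT-base vocab size are assumptions, not taken from the actual script):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
batch, seq_len, vocab_size = 1, 128, 30522  # assumed BERT-like dimensions

# Random token ids in [0, vocab_size), int64 as ONNX Runtime expects.
input_ids = rng.integers(0, vocab_size, size=(batch, seq_len), dtype=np.int64)
attention_mask = np.ones_like(input_ids)

print(input_ids.shape)  # (1, 128)
```

These arrays can then be fed to both the PyTorch model and the ONNX Runtime sessions to compare the min/max of the resulting logits.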

I didn't explicitly send it to the Neural Engine / M1 GPU, do you know if this is something that happens under the hood?

For those still looking this up in the future: I managed to get it working by reshaping my tensors and concatenating them along the CoreML-compliant dimension (in my case,...

Thanks for the issue. Feel free to submit a PR for the fix. I don't get much time to look at this package these days.

Hi Qichao, I'm flattered you used my package in your research. I'm very open to contributions. As you can see, this project has gone a bit stale as of late....