jingyanwangms
@JonathanSamelson do you get the same error with the ONNX conversion of the standard yolov4 model? @jennifererwangg @ravimashru, who added those two notebooks, might know more about this model.
@JingyaHuang can you please review it?
Same issue for PR 575 and PR 681. We're looking into it.
This is caused by https://github.com/huggingface/transformers/commit/4f09d0fd888dbf2660313f9715992822acfb99ce and fixed in PR #1730.
Hi @c1aude, thank you for the detailed repro information. Unfortunately, when I try to run the script, I get the error below. Can you please verify the script and share a version...
Yes, this is a warning. It indicates that TensorRT is skipping a specific tactic (optimization approach) due to an internal issue in its implementation (canImplement1). This should not block running your...
@samsonyilma Thank you for the simple repro. Yes, I can see different results between CUDA and TensorRT with onnxruntime-gpu 1.19.2 and tensorrt 10.4.0. We're investigating on our side.
@c1aude @BengtGustafsson We are having trouble reproducing your error on our end. We tried running the provided script with the latest main (built from source) and do not see a difference in CPU...
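When chasing provider-dependent differences like this, a tolerance-based comparison is usually more informative than checking exact equality of outputs. A minimal numpy-only sketch (the function name and default thresholds here are illustrative, not from this thread):

```python
import numpy as np

def compare_outputs(ref, other, rtol=1e-3, atol=1e-5):
    """Summarize the difference between two model outputs (e.g. CUDA vs TensorRT)."""
    ref = np.asarray(ref, dtype=np.float64)
    other = np.asarray(other, dtype=np.float64)
    abs_diff = np.abs(ref - other)
    # Guard against division by zero for near-zero reference values.
    rel_diff = abs_diff / np.maximum(np.abs(ref), 1e-12)
    return {
        "max_abs": float(abs_diff.max()),
        "max_rel": float(rel_diff.max()),
        "allclose": bool(np.allclose(ref, other, rtol=rtol, atol=atol)),
    }
```

Feeding the same input through two `InferenceSession`s (one per execution provider) and passing the outputs to a helper like this makes it easy to tell benign float noise from a real divergence.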
@BengtGustafsson in our testing, we see the variance on A100 but not on V100, so it is architecture dependent. What GPU architecture are you using? We asked Nvidia people in our...
@BengtGustafsson can you give `export NVIDIA_TF32_OVERRIDE=0` a try? This disables the TF32 Tensor Core optimization, which is on by default on A100. My testing (ORT 1.18.2, TRT 10.4) shows there's no more...
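If setting the variable in the shell is inconvenient, the same override can be applied from Python, as long as it happens before the CUDA context is created (i.e. before the first GPU session is built). A minimal sketch; the commented session-creation lines are illustrative and assume onnxruntime-gpu is installed:

```python
import os

# NVIDIA_TF32_OVERRIDE=0 disables TF32 Tensor Core math, which is on by
# default on Ampere GPUs such as A100. It must be set before the CUDA
# context is created, so do it before building any GPU inference session.
os.environ["NVIDIA_TF32_OVERRIDE"] = "0"

# Illustrative usage (requires onnxruntime-gpu and a model file):
# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
```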