jingyanwangms
@JonathanSamelson do you get the same error with the ONNX conversion of the standard yolov4 model? @jennifererwangg @ravimashru, who added those two notebooks, might know more about this model.
@JingyaHuang can you please review it?
Same issue for PR 575 and PR 681. We're looking into it.
This is caused by https://github.com/huggingface/transformers/commit/4f09d0fd888dbf2660313f9715992822acfb99ce and fixed in PR #1730.
Hi @c1aude, thank you for the detailed repro information. Unfortunately, when I try to run the script, I get the error below. Can you please verify the script and share a version...
Yes, this is a warning. It indicates that TensorRT is skipping a specific tactic (optimization approach) due to an internal issue in its implementation (canImplement1). This should not block running your...
@samsonyilma Thank you for the simple repro. Yes, I can see different results between CUDA and TensorRT with onnxruntime-gpu 1.19.2 and tensorrt 10.4.0. We're investigating on our side.
@c1aude @BengtGustafsson We are having trouble reproducing your error on our end. We tried running the provided script with the latest main (built from source) and do not see a difference in CPU...
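When chasing provider-dependent differences like this, a tolerance-based comparison is usually more informative than checking exact equality of outputs. A minimal numpy-only sketch (the function name and default thresholds here are illustrative, not from this thread):

```python
import numpy as np

def compare_outputs(ref, other, rtol=1e-3, atol=1e-5):
    """Summarize the difference between two model outputs (e.g. CUDA vs TensorRT)."""
    ref = np.asarray(ref, dtype=np.float64)
    other = np.asarray(other, dtype=np.float64)
    abs_diff = np.abs(ref - other)
    # Guard against division by zero for near-zero reference values.
    rel_diff = abs_diff / np.maximum(np.abs(ref), 1e-12)
    return {
        "max_abs": float(abs_diff.max()),
        "max_rel": float(rel_diff.max()),
        "allclose": bool(np.allclose(ref, other, rtol=rtol, atol=atol)),
    }
```

Feeding the same input through two `InferenceSession`s (one per execution provider) and passing the outputs to a helper like this makes it easy to tell benign float noise from a real divergence.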
@BengtGustafsson in our testing, we see the variance on A100 but not on V100, so it is architecture dependent. What GPU architecture are you using? We asked Nvidia people in our...
@BengtGustafsson can you give `export NVIDIA_TF32_OVERRIDE=0` a try? This disables the TF32 Tensor Core optimization, which is on by default on A100. My testing (ORT 1.18.2, TRT 10.4) shows there's no more...
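If setting the variable in the shell is inconvenient, the same override can be applied from Python, as long as it happens before the CUDA context is created (i.e. before the first GPU session is built). A minimal sketch; the commented session-creation lines are illustrative and assume onnxruntime-gpu is installed:

```python
import os

# NVIDIA_TF32_OVERRIDE=0 disables TF32 Tensor Core math, which is on by
# default on Ampere GPUs such as A100. It must be set before the CUDA
# context is created, so do it before building any GPU inference session.
os.environ["NVIDIA_TF32_OVERRIDE"] = "0"

# Illustrative usage (requires onnxruntime-gpu and a model file):
# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
```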