jean-christophe datin
My CPU arch is x86_64. Trying the main branch. Will keep in touch.
much better
I have a question: in the previous transition from ORT 1.5 to 1.16 I had to add the flag --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" to fix a build issue. Is this flag still...
OK, the latest main (1.17.0) now builds. So we can fix the issue and wait for the new 1.17.x. About --cmake_extra_defines: what is the effect? Does it mean onnxrt +...
I do not control how the CUDA code is compiled; I got the CUDA 12.2 packages from the Nvidia repo. But are you talking about the 1.17.0 release with my issue...
Thanks. How can I avoid using gcc7 to compile ONNX Runtime's CUDA code? I did nothing to select it: I am just calling RUN CC=gcc-11 CXX=g++-11 ./build.sh
I tried --cmake_extra_defines CMAKE_CUDA_HOST_COMPILER and it fixed the problem. Closing the case
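For readers hitting the same problem, here is a minimal sketch of what such a build invocation might look like. CC and CXX only select the compiler for the C/C++ sources; nvcc chooses its own host compiler unless CMAKE_CUDA_HOST_COMPILER is set. The compiler path, CUDA and cuDNN locations, and the Release configuration below are assumptions for illustration, not values taken from this thread. The script only prints the command (a dry run), so the paths can be checked before the build is started.

```shell
#!/usr/bin/env bash
# Dry-run sketch: assemble the ONNX Runtime build command and print it.
# All paths below are ASSUMED; adjust them to the local installation.
set -eu

HOST_CXX="/usr/bin/g++-11"      # assumed location of the desired host compiler
CUDA_HOME_DIR="/usr/local/cuda" # assumed CUDA 12.2 install prefix
CUDNN_HOME_DIR="/usr"           # assumed cuDNN prefix

# CMAKE_CUDA_HOST_COMPILER tells CMake which host compiler nvcc should use.
# Further defines (for example CMAKE_CUDA_ARCHITECTURES=75) can be appended
# after it, separated by spaces, if the build needs them.
BUILD_CMD="./build.sh --config Release --use_cuda"
BUILD_CMD="${BUILD_CMD} --cuda_home ${CUDA_HOME_DIR} --cudnn_home ${CUDNN_HOME_DIR}"
BUILD_CMD="${BUILD_CMD} --cmake_extra_defines CMAKE_CUDA_HOST_COMPILER=${HOST_CXX}"

# Print the full command instead of running it.
echo "CC=gcc-11 CXX=g++-11 ${BUILD_CMD}"
```

Once the printed command looks right, it can be pasted into the Dockerfile RUN line in place of the bare ./build.sh call.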
Closing since I realized that with ORT 1.16.3 I succeeded in running my model with TRT, and it is faster than the CUDA EP in TF32.
I tested my model again with the latest onnxrt 1.17.1 and got the same performance results for the TRT EP and the CUDA EP. I would have expected that the TRT EP would have used...
Even the NonZero op seems to be implemented in TRT: could it be implemented in the ONNXRT TRT EP? With these 3 operators, ALL of the faster-rcnns would run on TRT and...