Results: 29 comments of wang7393

@nvpohanh Sorry, could you please provide more detailed usage instructions?

@nvpohanh [profile.zip](https://github.com/NVIDIA/TensorRT/files/9307103/profile.zip)

@nvpohanh Have you reached a conclusion yet? My project is on a tight deadline. With driver versions newer than R495.07, inference time fluctuates hugely. The...

@nvpohanh I generate TRT engines at runtime, i.e., I use C++ code to do the model conversion and inference. How do I set this up?
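
For context, my runtime conversion path looks roughly like the following sketch, assuming a TensorRT 8.x ONNX workflow; the file names, workspace size, and Logger class are placeholders, not my exact code:

```cpp
// Minimal sketch: build a TensorRT engine at runtime from an ONNX file.
// Assumes TensorRT 8.x; "model.onnx" and the workspace size are placeholders.
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <memory>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    auto builder = std::unique_ptr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(logger));
    const auto flags =
        1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(flags));
    auto parser = std::unique_ptr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, logger));

    if (!parser->parseFromFile("model.onnx",
            static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        std::cerr << "Failed to parse ONNX model" << std::endl;
        return 1;
    }

    auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    config->setMaxWorkspaceSize(1ULL << 30); // 1 GiB workspace (placeholder value)

    // Serialize the engine so it can be cached and later deserialized by IRuntime.
    auto serialized = std::unique_ptr<nvinfer1::IHostMemory>(
        builder->buildSerializedNetwork(*network, *config));
    std::ofstream out("model.engine", std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()), serialized->size());
    return 0;
}
```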

@nvpohanh My code for measuring inference time is as follows:

```cpp
auto time_2 = std::chrono::high_resolution_clock::now();
m_context->enqueueV2(m_buffers->getDeviceBindings().data(), m_stream, nullptr);
cudaStreamSynchronize(m_stream); // wait for the asynchronous enqueue to finish before stopping the timer
auto time_3 = std::chrono::high_resolution_clock::now();
std::chrono::duration<double, std::milli> inferenceTime = time_3 - time_2; // elapsed time in ms
```

**Under...

@nvpohanh Additional test: in the above experiment I added CUDA Graphs to my runtime code, but the finding that the inference time stabilized at 3-4 ms after running for a while was overturned because I...

@nvpohanh Only --useCudaGraph gives better results. As with the other issues I mentioned earlier, how do I solve this problem in my own runtime code?
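
From what I understand, the runtime equivalent of trtexec's --useCudaGraph is to capture the enqueueV2 call into a CUDA graph once and then replay it. Below is a minimal sketch of what I am trying, assuming a CUDA 11.x toolkit and TensorRT 8.x; the function and variable names are illustrative, not my exact code:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Capture one enqueueV2() call into a CUDA graph so later inferences can
// replay the graph instead of re-enqueueing every kernel (the runtime
// analogue of trtexec --useCudaGraph). Names are illustrative only.
cudaGraphExec_t captureInferenceGraph(nvinfer1::IExecutionContext* context,
                                      void** bindings, cudaStream_t stream)
{
    // Warm-up run so TensorRT finishes any lazy initialization before capture.
    context->enqueueV2(bindings, stream, nullptr);
    cudaStreamSynchronize(stream);

    cudaGraph_t graph = nullptr;
    cudaGraphExec_t graphExec = nullptr;
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeThreadLocal);
    context->enqueueV2(bindings, stream, nullptr);
    cudaStreamEndCapture(stream, &graph);

    // CUDA 11.x signature; CUDA 12 changed this overload.
    cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);
    cudaGraphDestroy(graph);
    return graphExec;
}

// Per-inference: replay the captured graph instead of calling enqueueV2 again.
//   cudaGraphLaunch(graphExec, stream);
//   cudaStreamSynchronize(stream);
```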

@nvpohanh Under 473.47, the inference time doubles. Under 511.65, inference times still fluctuate wildly.

@nvpohanh If there are any new developments or conclusions, please share them here. Thanks.

@Iffa-Meah The model contains custom layers. After checking, there is no problem with GPU inference. But with CPU inference, the problems described above still exist: in debug mode, the...