
FP16 and INT8 inference speed

Open dadaligoudan opened this issue 5 months ago • 0 comments

Hi, thanks for your excellent work; it has really helped me in my own project. I have a question, though. I modified your code to run inference on my own model and built engines in both FP16 and INT8 precision. Both modes produce correct inference results, but they run at almost exactly the same speed. Why does the INT8 engine not infer faster than the FP16 one? Any suggestions would be much appreciated. PS: I am using the realesrgan-x4v3 model, converted to ONNX format.
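For reference, here is a minimal sketch of how the two engines are built. This assumes the standard `nvinfer1` builder API rather than this repo's exact wrapper code, and the INT8 calibrator is omitted:

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <cstdint>
#include <memory>

// Build a serialized engine from an ONNX file in FP16 or INT8 mode.
// Illustrative helper only; error checking and the INT8 calibrator
// (required for post-training quantization) are omitted.
nvinfer1::IHostMemory* buildEngine(nvinfer1::ILogger& logger,
                                   const char* onnxPath, bool useInt8) {
    auto builder = std::unique_ptr<nvinfer1::IBuilder>(
        nvinfer1::createInferBuilder(logger));
    auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(
        builder->createNetworkV2(1U << static_cast<uint32_t>(
            nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH)));
    auto parser = std::unique_ptr<nvonnxparser::IParser>(
        nvonnxparser::createParser(*network, logger));
    parser->parseFromFile(
        onnxPath, static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

    auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(
        builder->createBuilderConfig());
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
    if (useInt8) {
        // kINT8 only *permits* INT8 kernels; TensorRT's auto-tuner still
        // picks the fastest implementation per layer, so layers can fall
        // back to FP16/FP32 and the INT8 engine may end up no faster.
        config->setFlag(nvinfer1::BuilderFlag::kINT8);
        // config->setInt8Calibrator(&calibrator); // needed for calibration
    }
    return builder->buildSerializedNetwork(*network, *config);
}
```

As far as I understand, setting `kINT8` does not force INT8 execution, so identical timings could simply mean most layers still run in FP16. Inspecting the built engine (e.g. via `trtexec --dumpLayerInfo` or `IEngineInspector`) should reveal which precision each layer actually uses.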


dadaligoudan · Aug 06 '25 11:08