jingyu-ml
@teith What node are you using? Are you on AWS, GCP, or your own machine? There are many other factors at play here, for example memory size, concurrent workloads, temperature...
@TheBge12138 Thanks for the feedback. The current codebase is from 6 months ago and differs from the blog post; the team is refreshing the code in this repo and...
@teith Would it be possible for you to attach the INT8 UNet ONNX file somewhere?
@teith Apologies for the delayed response. I ran your models on our A100-PCIE-40G GPU. Here are the logs: [fp16.log](https://github.com/NVIDIA/TensorRT/files/15239280/fp16.log) [int8.log](https://github.com/NVIDIA/TensorRT/files/15239281/int8.log)

FP16:
```
[05/07/2024-18:08:35] [I] Latency: min = 91.6599 ms, max...
```
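For anyone wanting to reproduce a comparison like this: the quoted log format is what `trtexec` prints in its performance summary. A minimal sketch of how the two runs could be produced, assuming the ONNX file names (`unet_fp16.onnx`, `unet_int8.onnx`) are placeholders and the INT8 model already contains Q/DQ nodes:

```
# Build and benchmark an FP16 engine from the FP16 ONNX export
trtexec --onnx=unet_fp16.onnx --fp16 --saveEngine=unet_fp16.plan > fp16.log 2>&1

# Build and benchmark the quantized model; --fp16 is kept alongside --int8
# so layers without Q/DQ nodes can still fall back to FP16 kernels
trtexec --onnx=unet_int8.onnx --int8 --fp16 --saveEngine=unet_int8.plan > int8.log 2>&1
```

Each log will then contain a `[I] Latency: min = ..., max = ...` summary line like the one quoted above, which is what to compare between the two precisions.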