TensorRT
🐛 [Bug] Torch-TRT QDQ nodes affect perf vs PTQ, native TRT they do not
Bug Description
When using the PyTorch QAT toolkit, QAT inference through Torch-TRT is slower than PTQ; with native TRT this is not the case.
Torch-TRT:
| Model | Accuracy | Performance |
|---|---|---|
| Baseline MobileNetv2 | 75.56% | 11.92ms |
| Base + TRT(TRT FP32) | 75.59% | 6.78ms |
| PTQ + TRT(TRT int8) | 71.41% | 1.57ms |
| QAT+TRT(TRT INT8) | 74.00% | 2.18ms |
Native TRT:
| Model | Accuracy | Performance |
|---|---|---|
| Baseline MobileNetv2 | 71.11% | 11.92ms |
| Base + TRT (TRT FP32) | 71.13% | 5.95ms |
| PTQ + TRT (TRT int8) | 68.11% | 1.59ms |
| QAT+TRT (TRT INT8) | 70.31% | 1.61ms |
To Reproduce
Steps to reproduce the behavior:
- Torch-TRT notebook
- TRT notebook - reach out to @ncomly-nvidia
Expected behavior
The effect of QDQ nodes on perf is the same between TRT & Torch-TRT.
Environment
Build information about Torch-TensorRT can be found by turning on debug messages.
DLFW 22.04: nvcr.io/nvidia/pytorch:22.04-py3