2730gf
Thank you for your reply. I am using the mmdetection framework, and the optimizer it uses is not a `torch.optim` optimizer, so an error is reported. How can I solve this?
Hello, can anyone provide a solution?
Hello, I am reading the source code to try to solve the optimizer problem, but I found another problem: if QAT does not specify dummy_input, there will be this...
Here is my code: `quantizer = QAT_Quantizer(model, config_list, optimizer)` followed by `quantizer.compress()`.
The link I refer to is as follows: https://github.com/microsoft/nni/blob/75e5d5b51f344201c32beceda94165acbd68fc44/examples/model_compress/end2end_compression.py#L220. dummy_input is not used there. Also, does the shape of this dummy_input need to be the same as the input size...
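For reference, here is a minimal sketch of how a dummy_input can be passed, assuming an NNI 2.x-style `QAT_Quantizer(model, config_list, optimizer, dummy_input=...)` signature (the exact argument list may differ between NNI versions). The toy model, optimizer, and the `(1, 3, 224, 224)` shape are placeholders, not taken from the issue; the dummy_input only needs to match the shape your model expects, and a batch size of 1 is usually enough.

```python
import torch
import torch.nn as nn
from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

# Toy stand-in model/optimizer; substitute your mmdetection model and its optimizer here.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

config_list = [{
    'quant_types': ['weight', 'output'],
    'quant_bits': {'weight': 8, 'output': 8},
    'op_types': ['Conv2d', 'Linear'],
}]

# Assumed input shape; only the shape matters, the values can be random.
dummy_input = torch.randn(1, 3, 224, 224)

quantizer = QAT_Quantizer(model, config_list, optimizer, dummy_input=dummy_input)
quantizer.compress()
```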
@lix19937 Hello, can you help me take a look at this issue? It is also an inconsistency issue. https://github.com/NVIDIA/TensorRT/issues/4400
@lix19937 Thanks for your reply. I have updated the screenshot.
@lix19937 This model is fp32, and generally speaking, fp32 inference should not produce such a large diff. With trt8.6 the accuracy is completely aligned, but trt10.8...
@lix19937 In polygraphy, TF32 is disabled by default: `[I] TF32 is disabled by default. Turn on TF32 for better performance with minor accuracy differences.` Initially, we discovered an accuracy...
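As a side note, this is roughly how we quantify the diff between the two engines' outputs; a plain NumPy sketch, where `out_trt86.npy` and `out_trt108.npy` are hypothetical dumps of the outputs produced by the TensorRT 8.6 and 10.8 engines for the same input:

```python
import numpy as np

# Hypothetical output dumps from the two engines for an identical input.
out_ref = np.load("out_trt86.npy").astype(np.float32)
out_new = np.load("out_trt108.npy").astype(np.float32)

abs_diff = np.abs(out_ref - out_new)
rel_diff = abs_diff / (np.abs(out_ref) + 1e-7)

print(f"max abs diff:  {abs_diff.max():.6f}")
print(f"mean abs diff: {abs_diff.mean():.6f}")
print(f"max rel diff:  {rel_diff.max():.6f}")

# Cosine similarity is a useful single-number sanity check for fp32 outputs.
a, b = out_ref.ravel(), out_new.ravel()
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
print(f"cosine similarity: {cos:.6f}")
```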
@lix19937 Hello, it is not trained using torch, but it is indeed a transformer-based model. Is there any way to fix this accuracy issue?