TensorRT

Conversion to int8 with trtexec fails


Description

I am trying to convert an ONNX model to INT8 with the latest TensorRT. I got the following error:

[05/19/2023-14:42:31] [E] Error[2]: Assertion getter(i) != 0 failed. 
[05/19/2023-14:42:33] [E] Error[2]: [weightConvertors.cpp::quantizeBiasCommon::310] Error Code 2: Internal Error (Assertion getter(i) != 0 failed. )
[05/19/2023-14:42:33] [E] Engine could not be created from network
[05/19/2023-14:42:33] [E] Building engine failed
[05/19/2023-14:42:33] [E] Failed to create engine from model or file.
[05/19/2023-14:42:33] [E] Engine set up failed

But there were no errors before these lines. What does this mean?

Environment

TensorRT Version: 8.6.0.12

NVIDIA GPU: NVIDIA GeForce RTX 3080 Ti

NVIDIA Driver Version: 530.30.02

CUDA Version: 12.1

Operating System: Ubuntu 18.04.6 LTS Bionic

Python Version (if applicable): 3.8

PyTorch Version (if applicable): 2.0.1+cu117

Steps To Reproduce

I use trtexec; my command looks like this:

trtexec --onnx=/repo/int8-engine.trt/end2end.onnx \
        --calib=calib_data.h5 \
        --int8 \
        --saveEngine=/repo/int8-engine.trt/end2end.trt \
        --staticPlugins=/mmdeploy/buil/lib/libmmdeploy_tensorrt_ops.so \
        --shapes=input:1x3x800x1300 \
        --verbose

DaraOrange, May 19 '23 14:05

Can you please share all the files needed to reproduce this error? This includes the ONNX file, the calibration cache, and the steps to build the plugin, for example.

gcunhase, May 19 '23 16:05

ONNX file: https://drive.google.com/file/d/1OL7FC4mmGQQqjWAaYLdxcbDfesmSu_S8/view?usp=sharing
calibration data: https://drive.google.com/file/d/1ob-tog2o0DDnDlFzN-p35gjLIpbaKKUw/view?usp=sharing
pre-built ops: https://drive.google.com/file/d/18GBp1-ounLNqEReDOLchgUKqYzFDnC8F/view?usp=sharing (or you can build this .so as described here: https://github.com/OpenGVLab/InternImage/tree/master/detection)

DaraOrange, May 19 '23 20:05

Could you please also upload a full TRT verbose log here? At a quick glance, it looks like something is wrong with your scale.

zerollzeng, May 21 '23 14:05
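For context on what a bad scale can mean here: in the usual INT8 scheme, the bias of a convolution or fully connected layer is quantized with a scale equal to the product of the input scale and the per-channel weight scale, so a weight scale of zero (for example, an output channel whose weights are all zero after folding) makes that product zero and impossible to divide by, which is consistent with an assertion that a value is != 0. Below is a small sketch of the arithmetic with made-up numbers; the values and the per-channel layout are assumptions for illustration, not taken from this model.

import numpy as np

# Hypothetical per-tensor input scale and per-channel weight scales for one layer.
input_scale = 0.02
weight_scales = np.array([0.01, 0.03, 0.0])   # third output channel is all zeros -> scale 0
bias = np.array([0.5, -0.1, 0.2])

# Standard INT8 bias quantization: bias is stored as int32 with
# bias_scale[c] = input_scale * weight_scale[c].
bias_scales = input_scale * weight_scales
print(bias_scales)   # [2.e-04 6.e-04 0.e+00]

# Quantizing divides by the scale; a zero scale cannot be inverted.
q_bias = np.round(bias / np.where(bias_scales == 0, np.nan, bias_scales))
print(q_bias)        # [ 2500.  -167.    nan]  <- the zero-scale channel breaks

If that is what is happening, checking the calibration cache (or the per-channel weight ranges) for zero entries would be one way to narrow it down.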

I suppose the problem is in the calibration file. Now I have this problem:
[05/22/2023-06:57:41] [E] Error[1]: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
Something is wrong with the format. I made this calibration file with mmdeploy (and got an .h5 file). Could you please tell me what format trtexec expects?

DaraOrange, May 22 '23 08:05
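For reference, trtexec's --calib flag reads a TensorRT INT8 calibration cache (the small file a calibrator writes via write_calibration_cache), not raw sample data, so passing an .h5 archive of input tensors directly may be part of the problem. Below is a minimal sketch, assuming the .h5 file simply holds a dataset of preprocessed 1x3x800x1300 inputs, of a Python calibrator that feeds batches from HDF5 and writes such a cache; the dataset key "calib_data", the file names, and the exact layout of mmdeploy's .h5 output are assumptions and may differ from what mmdeploy actually produces.

import numpy as np
import h5py
import tensorrt as trt
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda

class H5Calibrator(trt.IInt8EntropyCalibrator2):
    """Feeds batches from an HDF5 file and writes the calibration cache file."""

    def __init__(self, h5_path, dataset_key, cache_path, batch_shape):
        super().__init__()
        self.cache_path = cache_path
        self.batch_shape = batch_shape
        # Assumption: the dataset holds preprocessed inputs shaped (N, 3, 800, 1300).
        self.samples = h5py.File(h5_path, "r")[dataset_key]
        self.index = 0
        nbytes = int(np.prod(batch_shape)) * np.dtype(np.float32).itemsize
        self.device_mem = cuda.mem_alloc(nbytes)

    def get_batch_size(self):
        return self.batch_shape[0]

    def get_batch(self, names):
        if self.index >= len(self.samples):
            return None  # no more data: calibration stops here
        batch = np.ascontiguousarray(self.samples[self.index], dtype=np.float32)
        cuda.memcpy_htod(self.device_mem, batch)
        self.index += 1
        return [int(self.device_mem)]  # one device pointer per network input

    def read_calibration_cache(self):
        return None  # always recalibrate instead of reusing an old cache

    def write_calibration_cache(self, cache):
        with open(self.cache_path, "wb") as f:
            f.write(cache)  # this file is what --calib= expects

The calibration pass itself still has to be run once through the TensorRT builder (set this calibrator on the builder config when building an INT8 engine from the ONNX); as far as I know, trtexec only consumes an already-written cache, which you would then pass as --calib=<cache file>.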

@zerollzeng I have the same issue: https://github.com/open-mmlab/mmdeploy/issues/2204
[06/21/2023-12:58:13] [TRT] [E] 2: [weightConvertors.cpp::quantizeBiasCommon::337] Error Code 2: Internal Error (Assertion getter(i) != 0 failed. )
Here is my full TRT log: pose-detection_tensorrt-int8_static-256x256_epose_debug.zip

shimen, Jun 28 '23 07:06

@DaraOrange Hi, could you tell me how you produced calib_data? I also need to generate calib_data for trtexec.

Egorundel, Jul 29 '24 08:07