nameli0722

5 comments by nameli0722

I don't understand what you mean. I used int8 quantization with a calibration set, and the inference result is correct. However, the GPU memory usage is larger than...

> Hello, could you please provide the GPU usage and inference speed with int8 and FP16? Thank you!

Original PyTorch model: GPU usage 5099 MB, inference time 1.7 s; ...

> How about building the engine first and then loading the engine? I think it can save some memory.
> Anyway, I'll try to improve this.

./trtexec --onnx...
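To illustrate the suggestion above, here is a minimal sketch of the build-once, load-many workflow using TensorRT's `trtexec` tool. The file names (`model.onnx`, `model.engine`) are placeholders, and the exact flags available depend on the TensorRT version installed; this is a usage sketch, not the commenter's actual command line.

```shell
# Step 1 (done once): build and serialize an int8 engine to disk.
# This run pays the builder's memory and time cost.
trtexec --onnx=model.onnx --int8 --saveEngine=model.engine

# Step 2 (every subsequent run): deserialize the prebuilt engine.
# Loading a saved engine skips the builder entirely, which is
# where the memory saving mentioned in the comment comes from.
trtexec --loadEngine=model.engine
```

Separating build from inference also makes startup faster, since engine building (especially with int8 calibration) is typically the slowest part of the pipeline.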