FasterTransformer
The int8 model saved by run_squad can't be imported by effective_transformer
Branch/Tag/Commit
main
Docker Image Version
nvcr.io/nvidia/tensorflow:20.06-tf1-py3-gcc9-th1.7
GPU name
A100
CUDA Driver
455.23.05
Reproduced Steps
I tried to load the model exported by run_squad into effective_transformer, but I got an error saying that 'bert/encoder/layer_0/amaxList:0' does not exist. Is there any way to save the int8 model together with amaxList?
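
To narrow this down, it may help to first check whether the exported checkpoint contains any amaxList variables at all. The sketch below is only a diagnostic, not a fix. It assumes the variables are stored under names of the form `bert/encoder/layer_<i>/amaxList` (the error message's name without the `:0` tensor suffix), and it works on a list of `(name, shape)` pairs such as the one `tf.train.list_variables(checkpoint_path)` returns. The helper name `layers_missing_amax` and the example listing, including its shapes, are made up for illustration.

```python
def layers_missing_amax(variables, num_layers, prefix="bert/encoder"):
    """Return the indices of layers that have no amaxList variable.

    `variables` is a list of (name, shape) pairs, e.g. the result of
    tf.train.list_variables(checkpoint_path). The naming pattern
    "<prefix>/layer_<i>/amaxList" is an assumption taken from the error message.
    """
    names = {name for name, _shape in variables}
    return [
        i for i in range(num_layers)
        if "{}/layer_{}/amaxList".format(prefix, i) not in names
    ]


# Hypothetical listing for a 2-layer model; names and shapes are placeholders.
example_variables = [
    ("bert/encoder/layer_0/amaxList", [1]),
    ("bert/encoder/layer_0/attention/self/query/kernel", [768, 768]),
    ("bert/encoder/layer_1/attention/self/query/kernel", [768, 768]),
]

missing = layers_missing_amax(example_variables, num_layers=2)
print("Layers without amaxList:", missing)
```

If every layer shows up as missing on the real checkpoint, the amaxList values were most likely never written during export, rather than being dropped at load time.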