Running a BERT-like model fails in the TensorFlow-TRT container: Could not initialize cublas
I have a BERT-like model running on TensorFlow 1.x on a T4 GPU, and I'm trying to speed it up with TensorRT (TRT).
I found that NGC provides Docker images that integrate TensorFlow and TRT, so I used those.
In nvcr.io/nvidia/tensorflow:19.10-py3, the model test passed, but the performance didn't improve at all.
I suspected that TRT 6 might not accelerate this model, so I tried nvcr.io/nvidia/tensorflow:20.12-tf1-py3, which integrates TRT 7, but I got an exception:

Then I tested the 20.01 and 20.03 releases and found that the same problem seems to occur on all 20.xx images. libcublas.so is loaded successfully, but the log then reports that cuBLAS initialization failed. I can confirm that no other program is using the GPU. How can I solve this problem?
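One thing that may be worth ruling out: cuBLAS initialization failures in TensorFlow are often reported when the GPU has no free memory left at the moment the cuBLAS handle is created, for example because TensorFlow has already reserved almost all of it. Whether that is the cause here is an assumption, not something confirmed in this thread. A minimal sketch for TF 1.x that creates the session with on-demand memory allocation (`make_session` is a hypothetical helper name):

```python
def make_session(graph=None):
    """Create a TF 1.x session that allocates GPU memory on demand."""
    # Imported lazily so the helper can be defined before TensorFlow is loaded.
    import tensorflow as tf  # TF 1.x API assumed: tf.ConfigProto, tf.Session

    config = tf.ConfigProto()
    # Grow the GPU allocation as needed instead of reserving nearly all memory
    # up front, which leaves headroom for cuBLAS and TensorRT.
    config.gpu_options.allow_growth = True
    return tf.Session(graph=graph, config=config)
```

If the error disappears with this setting, memory pressure was the likely trigger; if it persists, the cause lies elsewhere.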
I have run into this problem as well. Have you found a solution? Also, have you found any code that can be used to test the performance of TF-TRT on a BERT-like model?
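For the performance question, a framework-independent timing harness is often enough to compare the original graph against the TF-TRT converted one. The sketch below is an illustration rather than an official benchmark: `benchmark` and `run_inference` are hypothetical names, and `run_inference` is assumed to be a zero-argument callable that wraps one inference call (for example a `sess.run(...)` with a fixed batch).

```python
import statistics
import time


def benchmark(run_inference, warmup=10, iters=100):
    """Time a zero-argument callable and return latency statistics in ms."""
    # Warm-up runs are excluded so one-time costs (engine build, memory
    # allocation, kernel autotuning) do not distort the measurement.
    for _ in range(warmup):
        run_inference()

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    mean_ms = statistics.mean(latencies_ms)
    p99_index = min(len(latencies_ms) - 1, int(0.99 * len(latencies_ms)))
    return {
        "mean_ms": mean_ms,
        "p50_ms": statistics.median(latencies_ms),
        "p99_ms": latencies_ms[p99_index],
        # Calls per second; multiply by the batch size to get samples/s.
        "calls_per_s": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with a call into your model.
    print(benchmark(lambda: sum(range(10000)), warmup=5, iters=50))
```

Run it once with the unconverted model and once with the converted one, using the same batch size and sequence length, and compare `mean_ms` and `p99_ms`. In TF 1.x, `sess.run` returns only after the fetched tensors are available, so wall-clock timing around it covers the full inference call.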