Problem using Tensorflow serving and TF-TRT model
Hi, I'm serving a model that I converted from TF to TF-TRT, using the Docker image tensorflow/serving:latest-gpu.
The model loads perfectly with the gRPC and REST ports open, but when I run inference through a client script, I get the following error:
2021-09-13 03:30:26.137816: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvrtc.so.11.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-09-13 03:30:26.137847: F external/org_tensorflow/tensorflow/compiler/tf2tensorrt/stub/nvinfer_stub.cc:49] getInferLibVersion symbol not found.
I don't know whether LD_LIBRARY_PATH is wrong by default; since I'm using the latest image, I wouldn't expect this error.
So I switched to the latest-devel-gpu image so I could edit LD_LIBRARY_PATH with
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/targets/x86_64-linux/lib
aiming to make libnvrtc.so.11.1 findable.
LD_LIBRARY_PATH already included the correct path to libnvinfer.so.7 (/usr/lib/x86_64-linux-gnu), so I assumed nothing related to that library needed to change.
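To double-check which directories on the search path actually contain a given library, a small sketch like the following can help (this just walks LD_LIBRARY_PATH manually; it is a rough approximation of the dynamic loader's behavior, not a replacement for ldconfig):

```python
# Walk the directories in LD_LIBRARY_PATH (or an explicit path string)
# and report where a given shared library lives, e.g. libnvrtc.so.11.1.
import os

def find_shared_lib(name, search_path=None):
    """Return every path on the colon-separated search path containing `name`."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for d in search_path.split(os.pathsep):
        candidate = os.path.join(d, name)
        if d and os.path.exists(candidate):
            hits.append(candidate)
    return hits

if __name__ == "__main__":
    print(find_shared_lib("libnvrtc.so.11.1"))
```

If this prints an empty list inside the container, the export above (or the -e LD_LIBRARY_PATH=... docker flag) hasn't taken effect in the serving process's environment.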
These changes fixed that error, but a new one appeared:
2021-09-13 03:57:21.997684: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: UNKNOWN ERROR (304)
2021-09-13 03:57:21.997746: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (2f530a09a11c): /proc/driver/nvidia/version does not exist
and right after:
2021-09-13 03:58:21.538300: I external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer_plugin.so.7
terminate called after throwing an instance of 'pwgen::PwgenException' what(): Driver error:
Does anyone have an idea of how to solve the problem?
@audrey-siqueira
Could you please refer to the similar issues link1 and link2, and let us know if they help. Thanks!
I was able to solve the LD_LIBRARY_PATH and CUDA recognition problem using the following command:
sudo docker run -e "LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/targets/x86_64-linux/lib" --rm --runtime=nvidia -p 8500:8500 -p 8501:8501 --name clasificacion-gpu -v "$(pwd)"/MY_MODEL:/models/MY_MODEL -e MODEL_NAME=MY_MODEL -t tensorflow/serving:latest-gpu
The model loads perfectly; however, when I make the request, the logs below appear, apparently successfully, but after that nothing is returned. Maybe the problem is in my client script.
2021-09-15 01:56:01.564354: I external/org_tensorflow/tensorflow/compiler/tf2tensorrt/common/utils.cc:58] Linked TensorRT version: 7.2.2
2021-09-15 01:56:02.016981: I external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer.so.7
2021-09-15 01:56:02.018394: I external/org_tensorflow/tensorflow/compiler/tf2tensorrt/common/utils.cc:60] Loaded TensorRT version: 7.2.2
2021-09-15 01:56:02.079982: I external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer_plugin.so.7
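To rule out the client script, a minimal REST client sketch against TF Serving's predict endpoint is below. The input shape is an assumption; check the model's actual signature with saved_model_cli show --dir MY_MODEL/1 --all, since a mismatched input name or shape can make a request hang or fail silently:

```python
# Minimal TF Serving REST client sketch. The example instances and the
# default signature name are assumptions -- verify them against your
# model's signature before using this.
import json
import urllib.request

def build_predict_request(instances, signature_name="serving_default"):
    """Build the JSON body expected by TF Serving's REST predict API."""
    return json.dumps({"signature_name": signature_name,
                       "instances": instances})

def predict(instances, model_name="MY_MODEL", host="localhost", port=8501):
    """POST to /v1/models/<model>:predict and return the predictions list."""
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)
    body = build_predict_request(instances).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["predictions"]
```

Adding an explicit timeout (as above) at least turns a silent hang into a visible error.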
Does anyone have a solution?
I'm also facing similar issues. Any thoughts on this?
any progress for this problem?
I suppose the problem with libnvinfer.so.7 is just a warning (W), and the following line (with F) is the actual issue?
2021-09-13 03:30:26.137847: F external/org_tensorflow/tensorflow/compiler/tf2tensorrt/stub/nvinfer_stub.cc:49] getInferLibVersion symbol not found.
@audrey-siqueira,
I saw a similar issue in the NVIDIA/TensorRT repo, and they had a fix for pwgen, which is an internal module inside TRT. The fix was introduced in the TRT 8.0 release. Please try TRT >= 8.0 and let us know if you still face the issue.
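Upgrading means re-converting the SavedModel with a TensorFlow build linked against TRT >= 8.0. A rough sketch of the conversion step (paths and FP16 precision are placeholders; this requires a TensorFlow installation with TensorRT support, so the import is kept inside the function):

```python
# Sketch: re-convert a SavedModel with TF-TRT. The input/output paths are
# hypothetical; FP16 is shown only as an example precision mode.
def convert_to_trt(saved_model_dir, output_dir):
    """Convert a SavedModel into a TF-TRT SavedModel for serving."""
    # Import here so the sketch can be read without TensorRT installed.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params)
    converter.convert()
    converter.save(output_dir)

if __name__ == "__main__":
    convert_to_trt("MY_MODEL/1", "MY_MODEL_TRT/1")
```

The converted model directory can then be mounted into the serving container exactly as in the docker run command above.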
Thank you!
Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments (if you still have queries on this). Thank you!