Problem using Tensorflow serving and TF-TRT model
Hi, I'm serving a model that I converted from TF to TF-TRT, using the Docker image tensorflow/serving:latest-gpu.
The model loads perfectly with the gRPC and REST ports open, but when I run inference through a client script, I get the following error:
2021-09-13 03:30:26.137816: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvrtc.so.11.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-09-13 03:30:26.137847: F external/org_tensorflow/tensorflow/compiler/tf2tensorrt/stub/nvinfer_stub.cc:49] getInferLibVersion symbol not found.
I don't know whether LD_LIBRARY_PATH is wrong by default; since I'm using the latest image, I wouldn't expect this error.
So I switched to the latest-devel-gpu image so I could edit LD_LIBRARY_PATH with
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/targets/x86_64-linux/lib
aiming to make libnvrtc.so.11.1 findable.
LD_LIBRARY_PATH already included the correct path to libnvinfer.so.7 (/usr/lib/x86_64-linux-gnu), so I assumed nothing related to that library needed to change.
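To double-check which directories on the search path actually contain a given library, a small sketch like the following can help (this just walks LD_LIBRARY_PATH manually; it is a rough approximation of the dynamic loader's behavior, not a replacement for ldconfig):

```python
# Walk the directories in LD_LIBRARY_PATH (or an explicit path string)
# and report where a given shared library lives, e.g. libnvrtc.so.11.1.
import os

def find_shared_lib(name, search_path=None):
    """Return every path on the colon-separated search path containing `name`."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for d in search_path.split(os.pathsep):
        candidate = os.path.join(d, name)
        if d and os.path.exists(candidate):
            hits.append(candidate)
    return hits

if __name__ == "__main__":
    print(find_shared_lib("libnvrtc.so.11.1"))
```

If this prints an empty list inside the container, the export above (or the -e LD_LIBRARY_PATH=... docker flag) hasn't taken effect in the serving process's environment.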
These changes fixed that error, but a new one appeared:
2021-09-13 03:57:21.997684: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: UNKNOWN ERROR (304)
2021-09-13 03:57:21.997746: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (2f530a09a11c): /proc/driver/nvidia/version does not exist
and right after:
2021-09-13 03:58:21.538300: I external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer_plugin.so.7
terminate called after throwing an instance of 'pwgen::PwgenException' what(): Driver error:
Does anyone have an idea of how to solve the problem?
@audrey-siqueira
Could you please refer to the similar issues link1 and link2, and let us know if they help. Thanks!
I was able to solve the LD_LIBRARY_PATH and CUDA recognition problem using the following command:
sudo docker run -e "LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/targets/x86_64-linux/lib" --rm --runtime=nvidia -p 8500:8500 -p 8501:8501 --name clasificacion-gpu -v "$(pwd)"/MY_MODEL:/models/MY_MODEL -e MODEL_NAME=MY_MODEL -t tensorflow/serving:latest-gpu
The model loads perfectly; however, when I make the request, the logs below appear, apparently successfully, but after that nothing is returned. Maybe the problem is in my client script.
2021-09-15 01:56:01.564354: I external/org_tensorflow/tensorflow/compiler/tf2tensorrt/common/utils.cc:58] Linked TensorRT version: 7.2.2
2021-09-15 01:56:02.016981: I external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer.so.7
2021-09-15 01:56:02.018394: I external/org_tensorflow/tensorflow/compiler/tf2tensorrt/common/utils.cc:60] Loaded TensorRT version: 7.2.2
2021-09-15 01:56:02.079982: I external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer_plugin.so.7
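To rule out the client script, a minimal REST client sketch against TF Serving's predict endpoint is below. The input shape is an assumption; check the model's actual signature with saved_model_cli show --dir MY_MODEL/1 --all, since a mismatched input name or shape can make a request hang or fail silently:

```python
# Minimal TF Serving REST client sketch. The example instances and the
# default signature name are assumptions -- verify them against your
# model's signature before using this.
import json
import urllib.request

def build_predict_request(instances, signature_name="serving_default"):
    """Build the JSON body expected by TF Serving's REST predict API."""
    return json.dumps({"signature_name": signature_name,
                       "instances": instances})

def predict(instances, model_name="MY_MODEL", host="localhost", port=8501):
    """POST to /v1/models/<model>:predict and return the predictions list."""
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)
    body = build_predict_request(instances).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["predictions"]
```

Adding an explicit timeout (as above) at least turns a silent hang into a visible error.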
Does anyone have a solution?
I'm also facing similar issues. Any thoughts on this?
any progress for this problem?
I suppose the problem with libnvinfer.so.7 is just a warning (W), and the following line (with F) is the actual issue?
2021-09-13 03:30:26.137847: F external/org_tensorflow/tensorflow/compiler/tf2tensorrt/stub/nvinfer_stub.cc:49] getInferLibVersion symbol not found.
@audrey-siqueira,
I saw a similar issue in the NVIDIA/TensorRT repo, and they had a fix for pwgen, which is an internal module inside TRT. The fix was introduced in the TRT 8.0 release. Please try TRT >= 8.0 and let us know if you still face the issue.
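Upgrading means re-converting the SavedModel with a TensorFlow build linked against TRT >= 8.0. A rough sketch of the conversion step (paths and FP16 precision are placeholders; this requires a TensorFlow installation with TensorRT support, so the import is kept inside the function):

```python
# Sketch: re-convert a SavedModel with TF-TRT. The input/output paths are
# hypothetical; FP16 is shown only as an example precision mode.
def convert_to_trt(saved_model_dir, output_dir):
    """Convert a SavedModel into a TF-TRT SavedModel for serving."""
    # Import here so the sketch can be read without TensorRT installed.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params)
    converter.convert()
    converter.save(output_dir)

if __name__ == "__main__":
    convert_to_trt("MY_MODEL/1", "MY_MODEL_TRT/1")
```

The converted model directory can then be mounted into the serving container exactly as in the docker run command above.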
Thank you!
Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments (if you still have queries on this). Thank you!