RuntimeError: NVRTC error when converting a model
This error occurs when I convert any model; the error log is:
cooperative_groups_helpers.h(87): error: identifier "cudaCGSynchronizeGrid" is undefined
1 error detected in the compilation of "generatedNativePointwise".
Traceback (most recent call last):
  File "/media/xxx/datadist/proj/skull_stripping_seg/convert_model2trt.py", line 47, in <module>
    model_trt = torch2trt_dynamic(model, [x], fp16_mode=False)
  File "/home/xxx/anaconda3/envs/py37/lib/python3.7/site-packages/torch2trt_dynamic/torch2trt_dynamic.py", line 565, in torch2trt_dynamic
    engine = builder.build_engine(network, config)
RuntimeError: NVRTC error:
The official torch2trt works without any error.
Any idea how to solve this?
Hi, could you please provide more details so I can reproduce this error, such as the PyTorch version, the CUDA/cuDNN versions, a test script, and anything else related?
By the way, I found something in TensorRT install guide:
Note: If you are developing an application that is being compiled with CUDA 11.2 or you are using CUDA 11.2 libraries to run your application, then you must install CUDA 11.1 using either the Debian/RPM packages or using a CUDA 11.1 tar/zip/exe package. NVRTC from CUDA 11.1 is a runtime requirement of TensorRT and must be present to run TensorRT applications. If you are using the network repo installation method, this additional step is not needed.
Not sure whether it is related.
I found that torch.linspace in my model's forward may be causing this error. When I remove it, everything works well. Is there any way to support torch.linspace in the forward pass?
A simple nn.Module for testing:

```python
class Upsample(nn.Module):
    def __init__(self, scale_factor):
        super(Upsample, self).__init__()
        self.scale_factor = scale_factor

    def forward(self, x):
        x_size = (x.size(-2), x.size(-1))
        original_x_step = 1 / (x_size[-2] - 1)
        x_grid = torch.linspace(0, 1, x_size[0]).to(x.device)
        return x + x, x_grid
```
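If linspace really is the trigger, one possible workaround (a sketch, not a confirmed fix) is to build the same values from an integer range plus elementwise arithmetic, since torch.linspace(0, 1, n) yields the same values as torch.arange(n, dtype=torch.float32) / (n - 1); inside forward() that would be something like x_grid = torch.arange(n, device=x.device, dtype=x.dtype) / (n - 1). The identity behind the rewrite, modelled in plain Python:

```python
def linspace_via_arange(start, end, steps):
    """Emulate linspace(start, end, steps) using only an integer range,
    mirroring the arange-based rewrite: start + i * step for i in 0..steps-1."""
    if steps == 1:
        # linspace with a single step returns just the start point
        return [float(start)]
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]
```

Whether torch2trt_dynamic handles the arange-based form is an assumption that needs verifying against its converter list.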
Hi, the module conversion works on my device with driver 450 and CUDA 10.2. Could you provide your environment information?
Driver Version: 440.64
CUDA Version: 10.2
cuDNN: 8.0.5
PyTorch: 1.5.0
TensorRT: 7.2.2.3