RuntimeError: NVRTC error when converting a model
This error occurs when I convert any model; the error log is:
cooperative_groups_helpers.h(87): error: identifier "cudaCGSynchronizeGrid" is undefined
1 error detected in the compilation of "generatedNativePointwise".
Traceback (most recent call last):
  File "/media/xxx/datadist/proj/skull_stripping_seg/convert_model2trt.py", line 47, in <module>
    model_trt = torch2trt_dynamic(model, [x], fp16_mode=False)
  File "/home/xxx/anaconda3/envs/py37/lib/python3.7/site-packages/torch2trt_dynamic/torch2trt_dynamic.py", line 565, in torch2trt_dynamic
    engine = builder.build_engine(network, config)
RuntimeError: NVRTC error:
The official torch2trt works without any error.
Any idea how to solve this?
Hi, could you please provide more details so I can reproduce this error, such as the PyTorch version, the CUDA/cuDNN versions, a test script, and anything else related?
By the way, I found something in TensorRT install guide:
Note: If you are developing an application that is being compiled with CUDA 11.2 or you are using CUDA 11.2 libraries to run your application, then you must install CUDA 11.1 using either the Debian/RPM packages or using a CUDA 11.1 tar/zip/exe package. NVRTC from CUDA 11.1 is a runtime requirement of TensorRT and must be present to run TensorRT applications. If you are using the network repo installation method, this additional step is not needed.
Not sure whether it is related.
I found that torch.linspace in my model's forward may be causing this error. When I remove it, everything works well. Is there any way to support torch.linspace in the forward pass?
A simple nn.Module for testing:

```python
class Upsample(nn.Module):
    def __init__(self, scale_factor):
        super(Upsample, self).__init__()
        self.scale_factor = scale_factor

    def forward(self, x):
        x_size = (x.size(-2), x.size(-1))
        original_x_step = 1 / (x_size[-2] - 1)
        x_grid = torch.linspace(0, 1, x_size[0]).to(x.device)
        return x + x, x_grid
```
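If linspace really is the trigger, one possible workaround (a sketch, not a confirmed fix) is to build the same values from an integer range plus elementwise arithmetic, since torch.linspace(0, 1, n) yields the same values as torch.arange(n, dtype=torch.float32) / (n - 1); inside forward() that would be something like x_grid = torch.arange(n, device=x.device, dtype=x.dtype) / (n - 1). The identity behind the rewrite, modelled in plain Python:

```python
def linspace_via_arange(start, end, steps):
    """Emulate linspace(start, end, steps) using only an integer range,
    mirroring the arange-based rewrite: start + i * step for i in 0..steps-1."""
    if steps == 1:
        # linspace with a single step returns just the start point
        return [float(start)]
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]
```

Whether torch2trt_dynamic handles the arange-based form is an assumption that needs verifying against its converter list.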
Hi, the module conversion works on my device with driver 450 and CUDA 10.2. Could you provide your environment information?
Driver Version: 440.64
CUDA Version: 10.2
cuDNN: 8.0.5
PyTorch: 1.5.0
TensorRT: 7.2.2.3