
cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Open Tclz opened this issue 5 years ago • 2 comments

Hi, I ran into the following error during training:

  File "train.py", line 131, in <module>
    output_dict = model(**batch)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "../models/multiatt/model.py", line 157, in forward
    obj_reps = self.detector(images=images, boxes=boxes, box_mask=box_mask, classes=objects, segms=segms)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "../utils/detector.py", line 111, in forward
    img_feats = self.backbone(images)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torchvision/models/resnet.py", line 98, in forward
    out = self.conv2(out)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 338, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

My environment: Python 3.6.6, CUDA 9.0.176, cuDNN 7.5.1, torch 1.1.0, torchvision 0.3.0. I have also tried several other configurations (cuDNN 7.4, torch 1.0, etc.), but none of them work. What should I do? Thank you :)
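When comparing configurations like these, it may help to check which versions PyTorch itself reports, since they can differ from what is installed system-wide. The following is only a sketch: the helper name `environment_report` is hypothetical and not part of the r2c code, and the `find_spec` guard is there only so the snippet still runs on a machine without PyTorch.

```python
import importlib.util


def environment_report():
    """Collect the versions that PyTorch itself reports (hypothetical helper)."""
    report = {"torch": None, "cuda": None, "cudnn": None, "gpu_available": False}
    # Leave the defaults in place if PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return report
    import torch

    report["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    report["cuda"] = torch.version.cuda
    # cuDNN version as an integer, or None if cuDNN is unavailable.
    report["cudnn"] = torch.backends.cudnn.version()
    report["gpu_available"] = torch.cuda.is_available()
    return report


report = environment_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

Running this inside the same conda environment used for training shows whether the CUDA and cuDNN versions match the ones listed above.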

Tclz, Sep 18 '20 02:09

I ran into the same problem and worked around it by adding two lines to the training file:

import torch
torch.backends.cudnn.enabled = False
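Note that this switches cuDNN off for the whole process, so PyTorch falls back to its own CUDA kernels, which is usually slower. Below is a minimal sketch of the same workaround wrapped in a function; the name `disable_cudnn` is hypothetical, and the `find_spec` guard only exists so the snippet runs where PyTorch is not installed.

```python
import importlib.util


def disable_cudnn():
    """Turn cuDNN off for PyTorch and return the resulting flag.

    Returns None when PyTorch is not installed (hypothetical helper).
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    # Same setting as in the workaround above: convolutions no longer use cuDNN.
    torch.backends.cudnn.enabled = False
    return torch.backends.cudnn.enabled


state = disable_cudnn()
print("cuDNN enabled:", state)
```

Calling this once near the top of the training script, before the model is built, should have the same effect as the two lines above.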

harukaza, Dec 11 '20 07:12

@harukaza I fixed it by changing the environment to: CUDA 10.1, cuDNN 7.6.4, Python 3.7, PyTorch 1.3.1, torchvision 0.4.1.

Tclz, Dec 14 '20 07:12