second.pytorch

cuda execution failed with error 2

Open zjuxiaobaiq opened this issue 6 years ago • 4 comments

Could anyone tell me how to solve this problem?

python 3.6
torch 1.0
cuda 10.0
cudnn 7.1.4

Traceback (most recent call last):
  File "./pytorch/train.py", line 306, in train
    ret_dict = net_parallel(example_torch)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/second.pytorch-master/second/pytorch/models/voxelnet.py", line 363, in forward
    preds_dict = self.network_forward(voxels, num_points, coors, batch_size_dev)
  File "/home/xxx/second.pytorch-master/second/pytorch/models/voxelnet.py", line 332, in network_forward
    voxel_features, coors, batch_size)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/second.pytorch-master/second/pytorch/models/middle.py", line 203, in forward
    ret = self.middle_conv(ret)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/spconv/modules.py", line 130, in forward
    input = module(input)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/spconv/conv.py", line 170, in forward
    grid=input.grid)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/spconv/ops.py", line 91, in get_indice_pairs
    stride, padding, dilation, out_padding, int(subm), int(transpose))
RuntimeError: /home/xxx/second.pytorch-master/spconv/src/spconv/indice.cu 125 cuda execution failed with error 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./pytorch/train.py", line 663, in <module>
    fire.Fire()
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "./pytorch/train.py", line 421, in train
    print(json.dumps(example["metadata"], indent=2))
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/encoder.py", line 428, in _iterencode
    yield from _iterencode_list(o, _current_indent_level)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/encoder.py", line 437, in _iterencode
    o = _default(o)
  File "/home/xxx/.conda/envs/dwzpy/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'ndarray' is not JSON serializable

zjuxiaobaiq • Mar 10 '20 03:03
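
A note on the second traceback in the report above: it comes from the debug print at `train.py` line 421, where `json.dumps` fails on a NumPy array inside `example["metadata"]`. That `TypeError` is a side effect and not the root cause, but it makes the original CUDA error harder to spot. A minimal sketch of a more tolerant debug print follows, assuming the metadata only needs to be readable and not round-tripped. The helper names `to_jsonable` and `dump_metadata` are hypothetical and not part of second.pytorch.

```python
import json


def to_jsonable(obj):
    """Fallback for json.dumps: convert objects the encoder cannot handle."""
    # NumPy arrays and scalars expose tolist(); use it when present.
    if hasattr(obj, "tolist"):
        return obj.tolist()
    # Last resort: keep the debug print alive with a string representation.
    return repr(obj)


def dump_metadata(metadata):
    """Pretty-print metadata without raising on non-serializable values."""
    return json.dumps(metadata, indent=2, default=to_jsonable)
```

Replacing the failing call with `print(dump_metadata(example["metadata"]))` would likely let the handler finish, so that the CUDA `RuntimeError` is the error that gets reported.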

Did you manage to solve this?

andraspalffy • Mar 28 '20 16:03

Make sure you have enough free memory on your GPU. I was facing this issue because my GPU was fully occupied. It started working when I cleared some memory.

Rajat-Mehta • Oct 27 '20 09:10
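
For context on the advice above: in the CUDA runtime API, error code 2 is `cudaErrorMemoryAllocation`, so a fully occupied GPU is a plausible cause. One way to check free memory before launching training is to query `nvidia-smi`. The sketch below assumes `nvidia-smi` is on the `PATH` and reports plain integer values in MiB. The function names `parse_gpu_memory` and `query_gpu_memory` are hypothetical helpers, not part of second.pytorch or spconv.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' rows (values in MiB) into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        gpus.append({"index": index, "used_mib": used, "free_mib": total - used})
    return gpus


def query_gpu_memory():
    """Return per-GPU memory usage, or [] when nvidia-smi is not available."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; also works on Python 3.6
        check=True,
    ).stdout
    return parse_gpu_memory(out)
```

Calling `query_gpu_memory()` right before training starts shows whether another process already holds most of the memory on the target GPU.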

I have the same issue. My model needs at most 16 GB, and my GPU has 32 GB of memory. Any idea how to solve this?

hadihdz • Jun 09 '23 16:06

I ran into the same issue. Has anyone solved it?

Update: I solved this problem with spconv 1.2.1.

lacie-life • Oct 28 '23 07:10