
RuntimeError

Open 0317lwj opened this issue 1 year ago • 0 comments

Hello, I ran the train.py file but encountered the following problem. How can I fix it?

```
dataset is HT21, images num is 5312 train/HT21-01
Using /home/liuwenjie/.cache/torch_extensions/py38_cu118 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/liuwenjie/.cache/torch_extensions/py38_cu118/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module _prroi_pooling...
Traceback (most recent call last):
  File "train.py", line 313, in
    cc_trainer.forward()
  File "train.py", line 50, in forward
    self.train()
  File "train.py", line 86, in train
    all_loss.backward()
  File "/home/liuwenjie/anaconda3/envs/DRNet2/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/liuwenjie/anaconda3/envs/DRNet2/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [4, 128, 96, 128]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```
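For context (not the actual DRNet code, just a hypothetical minimal sketch of this error class): `ReluBackward0` needs the ReLU output to compute its gradient, so any in-place op that overwrites that output (e.g. `+=`, `mul_`, or a later `nn.ReLU(inplace=True)`) bumps the tensor's version counter and autograd raises exactly this RuntimeError. The usual fix is to replace the in-place op with an out-of-place one; `torch.autograd.set_detect_anomaly(True)` pinpoints which line to change.

```python
import torch

# Broken pattern: in-place write to a tensor autograd still needs.
x = torch.randn(4, 3, requires_grad=True)
y = torch.relu(x)
y += 1          # in-place add mutates ReLU's output (version 0 -> 1)
loss = y.sum()
# loss.backward()  # raises: "...modified by an inplace operation...
#                  #  which is output 0 of ReluBackward0, is at version 1"

# Fix: use an out-of-place op so the saved tensor stays untouched.
x2 = torch.randn(4, 3, requires_grad=True)
y2 = torch.relu(x2)
y2 = y2 + 1     # allocates a new tensor; the autograd graph stays valid
y2.sum().backward()
print(x2.grad is not None)  # gradients flow as expected
```

To locate the offending line in train.py, wrap the run in `with torch.autograd.set_detect_anomaly(True):` as the hint suggests; the second traceback it prints points at the forward-pass operation, which in models like this is often a `relu_()`/`inplace=True` activation or an in-place arithmetic on a feature map.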

0317lwj · Sep 20 '24 07:09