GraphNeuralNetwork

Question about the chapter9 code from the book 《深入浅出图神经网络:GNN原理解析》

Open · ldq3 opened this issue on Apr 02 '24 · 0 comments

Hello! I am a beginner in deep learning, so if anything in my question turns out to be obvious, I would appreciate you pointing it out.

My environment is: Python 3.12.2, torch 2.2.2+cu121.

When I try to run main.py, an exception is raised at line 117:

    Exception has occurred: RuntimeError
    one of the variables needed for gradient computation has been modified by an inplace operation:
    [torch.cuda.FloatTensor [1682, 10]], which is output 0 of ReluBackward0, is at version 1;
    expected version 0 instead. Hint: the backtrace further above shows the operation that failed
    to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
      File "D:\private\project\RS\script\main.py", line 117, in train
        loss.backward()  # backpropagate to compute the parameter gradients
        ^^^^^^^^^^^^^^^
      File "D:\private\project\RS\script\main.py", line 152, in <module>
        train()
    RuntimeError: one of the variables needed for gradient computation has been modified by an
    inplace operation: [torch.cuda.FloatTensor [1682, 10]], which is output 0 of ReluBackward0,
    is at version 1; expected version 0 instead. Hint: the backtrace further above shows the
    operation that failed to compute its gradient. The variable in question was changed in there
    or anywhere later. Good luck!

and the following error message also appears:

      File "d:\private\project\RS\script\autoencoder.py", line 244, in forward
        v_outputs = self.activation(x_v)
      File "c:\Users\34635\AppData\Local\miniconda3\envs\data\Lib\site-packages\torch\nn\functional.py", line 1473, in relu
        result = torch.relu(input)
    (Triggered internally at ..\torch\csrc\autograd\python_anomaly_mode.cpp:118.)
      Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
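The second trace appears to come from PyTorch's anomaly detection (note the python_anomaly_mode.cpp reference). For reference, a minimal sketch of how that mode is enabled, in case anyone wants to reproduce the extra forward-pass trace (this is my own snippet, not code from the book):

```python
import torch

# With anomaly detection on, autograd records the forward-pass stack trace of the
# operation whose gradient later fails in backward(); that is where the
# "autoencoder.py ... v_outputs = self.activation(x_v)" lines above come from.
torch.autograd.set_detect_anomaly(True)
```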

My reading of the error message is that torch.relu() acted as an in-place operation and modified a tensor, but as far as I know torch.relu() is not an in-place operation itself, which confuses me. I later tried replacing x_v with x_v.clone(), but that did not solve the problem. A small standalone example of my understanding is sketched below.
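As far as I understand, this error usually means that the output of relu (which ReluBackward0 saves for the backward pass) is later modified in place somewhere, not that relu itself ran in place. A minimal sketch of that situation, written by me and unrelated to the book's code, which I believe produces the same kind of error:

```python
import torch

x = torch.randn(5, requires_grad=True)
y = torch.relu(x)    # ReluBackward0 saves this output for the backward pass
y += 1               # in-place edit bumps y's version counter (version 0 -> 1)
y.sum().backward()   # RuntimeError: ... output 0 of ReluBackward0 ... expected version 0 instead

# Cloning the *input* to relu (as I tried with x_v.clone()) does not change anything,
# because it is the saved *output* that gets modified afterwards. Rewriting the
# in-place edit as an out-of-place one avoids the error in this toy example:
#   y = torch.relu(x)
#   y = y + 1
```

If this understanding is correct, I suppose I need to find where the relu output in chapter9's code is modified in place after the forward pass, but I have not located it yet.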
