CRAFT-Reimplementation

train error

Open Jlpz opened this issue 6 years ago • 5 comments

Good job! The code trains fine on SynthText. However, the following error occurred when I ran trainic15data.py on 4 GPUs. I tried to address the problem and found a possible solution at https://blog.csdn.net/loopun/article/details/89295454. I then trained on 1 GPU, but that did not work and the error still occurs. Can anyone give me a suggestion? Thank you.

Traceback (most recent call last):
  File "trainic15data.py", line 183, in <module>
    loss.backward()
  File "/root/userfolder/anaconda3/envs/craft_pytorch/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/root/userfolder/anaconda3/envs/craft_pytorch/lib/python3.5/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: invalid argument 3: divide by zero at /pytorch/aten/src/THC/generic/THCTensorMathPairwise.cu:88

Jlpz avatar Nov 06 '19 13:11 Jlpz
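
The thread never identifies the root cause of the traceback above. One situation that can make a backward pass divide by zero, offered here purely as an assumption, is averaging a loss over a mask that selects no elements (for example, a crop with no text pixels). The sketch below shows a guard for that case; `masked_mean` is a hypothetical helper, not a function from this repository.

```python
import torch


def masked_mean(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean of values[mask], or a graph-connected zero if the mask is empty.

    Hypothetical helper: assumes the failure comes from an empty selection.
    """
    selected = values[mask]
    if selected.numel() == 0:
        # Multiplying the full sum by 0 keeps the result attached to the
        # autograd graph, so backward() still runs and yields zero gradients.
        return values.sum() * 0.0
    return selected.mean()
```

If this assumption holds for a given training script, replacing a bare `torch.mean(x[mask])` with such a guard would avoid the empty-selection case. It does not help if the zero divisor comes from somewhere else.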

@Jlpz Sorry, I am not sure what causes this error. Maybe you can join our WeChat group: https://github.com/backtime92/CRAFT-Reimplementation/issues/26#issue-512851949

backtime92 avatar Nov 06 '19 15:11 backtime92

@Jlpz You can scan the QR code directly.

backtime92 avatar Nov 07 '19 02:11 backtime92

@backtime92 Where is the QR code? I'd like to join too.

kouxichao avatar Nov 08 '19 12:11 kouxichao

Hello, I am hitting the same problem. Does anyone know the solution? @Jlpz @backtime92

pingzi5233 avatar Jun 03 '20 03:06 pingzi5233

I may have solved this problem by setting a small checkpoint-saving interval when training from synth.pth, and then continuing training from that early-iteration checkpoint. I haven't seen the problem again.

pingzi5233 avatar Jun 04 '20 02:06 pingzi5233
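
The workaround in the last comment (save a checkpoint after only a few iterations, then resume from it) can be sketched roughly as follows. This is a toy loop under stated assumptions, not the repository's training script: the function names, the `save_every` parameter, the checkpoint layout, and the MSE loss are all hypothetical and only illustrate the save-then-resume pattern.

```python
import torch
import torch.nn as nn


def save_checkpoint(model, optimizer, step, path):
    # Store weights, optimizer state, and the step counter in one file.
    torch.save(
        {"model": model.state_dict(),
         "optimizer": optimizer.state_dict(),
         "step": step},
        path,
    )


def load_checkpoint(model, optimizer, path):
    # Restore weights and optimizer state; return the step to resume from.
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]


def train(model, optimizer, batches, save_every, path, start_step=0):
    """Toy loop that saves every `save_every` steps (hypothetical interval)."""
    step = start_step
    for x, y in batches:
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        step += 1
        if step % save_every == 0:
            save_checkpoint(model, optimizer, step, path)
    return step
```

A second run would call `load_checkpoint` first and pass the returned value as `start_step`. Why this made the error disappear for the commenter is not explained in the thread.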