Training stuck at first epoch

Open RishikMani opened this issue 5 years ago • 3 comments

Hi Adam,

Thank you for the Mask-RCNN library for upgraded TensorFlow. Unfortunately, when I try to train, it always gets stuck at Epoch 1/100 and no further messages are displayed. Is it because verbose is set to 0 somewhere? I waited for more than 4 hours, but nothing ever seems to happen. Could someone help me with this?

RishikMani, Sep 04 '20 12:09

I have the same problem when trying to train. Were you able to solve it somehow?

clmpng, Oct 14 '20 13:10

I was never able to get it resolved. That is why I downgraded to tensorflow-gpu==1.14.0 and used the original Matterport Mask-RCNN repository.

RishikMani, Oct 15 '20 20:10

Is there any way to fix it using this repo?

st162053, Mar 02 '22 15:03