nlopez94

Results: 4 comments by nlopez94

Hi @kishwarshafin, the training process starts as expected with GPU activity visible, but it abruptly stops without any error message while processing the first epoch and determining the best checkpoint...

@kishwarshafin As I mentioned before, the same dataset and parameters were used when I ran this on CPU, and this process continued without abruptly ending as...

@kishwarshafin I will try this and update you on the results I get. Thank you so much for the support!

@kishwarshafin I just found the error that was causing this to abruptly exit without warning. I was running the script with insufficient memory, and after changing my instance type, everything...