shivamsbatra
Facing the same issue with 0.31.0.
Hi @rnyak, colleague of @Satwato here.

> looks like it is stuck in the evaluation step?

It didn't get stuck in evaluation, as we had set the flags so that there...
In short: 1) we get OOM when setting batch_size > 4 for longer training runs, i.e. full-data training (4 when using multi-GPU, 12 when using a single GPU for...
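For context, the per-device limits above translate into different effective global batch sizes under standard data-parallel training. A quick sketch of the arithmetic (the multi-GPU device count is NOT stated in this thread; 2 GPUs below is purely illustrative):

```python
def effective_batch_size(per_device_batch, num_devices, grad_accum_steps=1):
    """Global batch size under data-parallel training:
    samples per optimizer step = per-device batch * devices * accumulation steps."""
    return per_device_batch * num_devices * grad_accum_steps

# Numbers reported in this thread; 2 GPUs is an assumption for illustration.
single_gpu = effective_batch_size(12, 1)  # 12 samples per step
multi_gpu = effective_batch_size(4, 2)    # 8 samples per step
print(single_gpu, multi_gpu)
```

So even with more devices, the lower per-device limit can leave the multi-GPU run with a smaller effective batch than the single-GPU run.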
Traceback for the OOM, during max_steps training:

```
 51%|█████     | 60300/117800 [1:25:43
```
> are you still using NVTabular to transform your data? If NO, how did you create your schema file?

Yes, I am using NVTabular's re-written data for training now; earlier...