Multi-GPU training
Hi, currently training on GTA2Cityscapes takes 2 days for 100k epochs, which is very slow. How can I make this run on multiple GPUs?
If you really do mean 100k epochs, that is not slow. Try nn.DataParallel(model) to run on multiple GPUs; you can find a tutorial here: https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html
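A minimal sketch of the nn.DataParallel pattern from that tutorial (TinyNet here is just a placeholder model for illustration, not the actual network trained in this repo):

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the real segmentation network.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyNet()
if torch.cuda.device_count() > 1:
    # DataParallel splits each input batch across the visible GPUs
    # and gathers the outputs back on the default device.
    model = nn.DataParallel(model)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Training then proceeds exactly as in the single-GPU loop; only the
# model wrapping changes. Note that model.state_dict() keys gain a
# "module." prefix when wrapped, which matters when saving/loading.
out = model(torch.randn(4, 3, 32, 32, device=device))
print(out.shape)
```

Note that the per-GPU batch is batch_size / num_gpus, so you may need to increase the batch size (and possibly the learning rate) to actually see a speedup.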
Oops, I meant 100k iterations! Thanks for the link, will try it out and update.
I have an image-processing PC with 16 GB of RAM and an 8 GB GPU. Is that enough to train the model without hitting a CUDA out of memory error?
I don't remember the details, but in my experience the Cityscapes dataset usually requires an 11 GB GPU.
Thanks anyway, although that is bad news.
@kshitijagrwl Hi, have you completed the multi-GPU version?
Has anyone tried a multi-GPU version? I want to train with multiple GPUs. Please share how to set that up.
@lerndeep I'm trying it now, running into a few bugs (fairly new to PyTorch). Will update here if/when I get it working
@kshitijagrwl @lychrel Are you finishing the multi gpu computing? Looking forward to your reply!
@Lufei-github Tried it a couple times and couldn't avoid a memory leak that reboots my computer. I don't have this problem elsewhere, even in similar contexts (DeepLab)—but this is also a super simple training loop, so the culprit shouldn't be hard to find.
Ended up using different DA methods for the project I was working on, but I'd be curious to hear if anyone else experiences this behavior. Though I switched to a different problem, ASN gave really compelling results after letting the single-GPU jobs run.
@lychrel I don't really understand your answer, and I don't know what ASN is. Could you explain it more simply?