Nicolas Hug
Thanks for opening this issue @msaroufim! On top of model training time and accuracy, I think we'll also want to monitor the time for the DataLoader to yield an...
> I spent a lot of time porting the torchvision training references to use datapipes. I don't think they're suitable for the kind of benchmark we want to do here...
Some basic results, which are consistent with what I had a few months ago: Benchmarking `mobilenet_v3_large` (io bound) from the torchvision training references (https://github.com/pytorch/vision/pull/6196) on the AWS cluster, distributed over...
@msaroufim @VitalyFedyunin @NivekT following up on my earlier comments in https://github.com/pytorch/data/issues/416#issuecomment-1164404834, I also have a separate PR (https://github.com/pytorch/vision/pull/6196) that already provides support for **the cross-product** of:

- Distributed Learning (DDP)...
Thanks for the report, https://github.com/pytorch/pytorch/pull/61761 should provide a fix
Sounds good, I marked the PR as draft so we don't merge it accidentally. Please ping me when it's ready. BTW, as a follow up to your previous issue: https://github.com/pytorch/hub/issues/224...
Yes, there's a GPU in the CI. I agree we should try to keep the CI test time within a reasonable limit, though.
> I don't think I recently have time for making the changes

There's no rush on our side @PeterL1n. Without tests, we have no guarantees that the models we are...
Done in https://github.com/pytorch/hub/pull/281