Pipeline io.
Currently, DatasetFolder load images and transform them in the same thread, so the latency is io latency + transform latency. It is better if we load images in another thread, so the latency is max(io, transform).
some test results:
python3 main.py -a resnet18 -b 256 -j 8 /imagenet --gpu 0
- base: 73.4s (sum of the data part of
Epoch [0]:results) - pipelined: 198.47s
CUDA_VISIBLE_DEVICES="0,1,2,3" python main.py -a resnet18 -b 1024 -j 16 /data/PyTorchRawImage/
- base: 913s
- pipelined: 580s
See also the coupled pr in https://github.com/pytorch/pytorch/pull/28199 .
Codecov Report
Merging #1482 into master will decrease coverage by
0.24%. The diff coverage is23.68%.
@@ Coverage Diff @@
## master #1482 +/- ##
==========================================
- Coverage 64.46% 64.21% -0.25%
==========================================
Files 83 83
Lines 6421 6458 +37
Branches 982 989 +7
==========================================
+ Hits 4139 4147 +8
- Misses 1992 2020 +28
- Partials 290 291 +1
| Impacted Files | Coverage Δ | |
|---|---|---|
| torchvision/datasets/folder.py | 62.6% <23.68%> (-19.45%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 53b062c...b4bd6b2. Read the comment docs.
Thanks for the PR!
Can you give more context on why this change is needed, and what it is trying to achieve?
Also, opening an issue before sending a PR adding a new functionality is generally the recommended way of proceeding, so that we can discuss beforehand about the needs of such functionality.
Hi, I have updated the pr message to give the context for this.
@fmassa could you please have a look at this pr ?
I think this adds a lot of complexity on something that is supposed to be simple, and the interactions with the multi-threaded DataLoader is not clear to me.
Even if this improves performance, I would rather not have it merged in torchvision. Indeed, one could ask why such an approach is not implemented for the other datasets.