
Enable AutoAugment and modernize DALI pipeline for ConvNets

klecki opened this issue 2 years ago • 0 comments

Update the DALI implementation to use the modern "fn" API instead of the old class-based approach.
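As a rough illustration of the migration, here is a minimal pipeline written with the functional `fn` API (a sketch, not the exact pipeline from this PR; the `data_dir` and `crop` parameters and the operator arguments are assumptions for the example):

```python
from nvidia.dali import pipeline_def, fn, types

# A decorated function replaces the old Pipeline subclass with its
# __init__/define_graph pair; operators become plain function calls.
@pipeline_def
def train_pipe(data_dir, crop=224):
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True)
    images = fn.decoders.image_random_crop(jpegs, device="mixed",
                                           output_type=types.RGB)
    images = fn.resize(images, size=[crop, crop])
    images = fn.crop_mirror_normalize(
        images,
        dtype=types.FLOAT,
        output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
        mirror=fn.random.coin_flip())
    return images, labels.gpu()
```

The pipeline is then built with, e.g., `train_pipe(data_dir="/imagenet/train", batch_size=128, num_threads=4, device_id=0)`, with the batch size, thread count, and device supplied by the `pipeline_def` decorator rather than a class constructor.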

Add a code path that uses AutoAugment in the DALI training pipeline. It can easily be extended to use other automatic augmentations.

You can read more about DALI's support of Automatic Augmentations here: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/auto_aug/auto_aug.html
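Inserting AutoAugment into an fn-API pipeline is roughly a one-line change (a sketch; the surrounding operators and parameters are assumptions, not the exact code of this PR):

```python
from nvidia.dali import pipeline_def, fn, types
from nvidia.dali.auto_aug import auto_augment

# Automatic augmentations rely on DALI conditional execution,
# which must be enabled on the pipeline.
@pipeline_def(enable_conditionals=True)
def train_pipe(data_dir, crop=224):
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True)
    images = fn.decoders.image_random_crop(jpegs, device="mixed",
                                           output_type=types.RGB)
    images = fn.resize(images, size=[crop, crop])
    # Apply the ImageNet AutoAugment policy to the GPU images;
    # swapping this call for trivial_augment.trivial_augment would
    # give TrivialAugment instead.
    images = auto_augment.auto_augment(images, shape=[crop, crop])
    images = fn.crop_mirror_normalize(
        images, dtype=types.FLOAT, output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255])
    return images, labels.gpu()
```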

The integration of the DALI pipeline with PyTorch additionally skips the transposition step when exposing NHWC data.
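On the PyTorch side, this can look roughly like the following (a sketch under the assumption that the pipeline emits `output_layout="HWC"` and the model runs in channels-last memory format; `pipe` and `model` are placeholders):

```python
import torch
from nvidia.dali.plugin.pytorch import DALIClassificationIterator

# When the DALI pipeline outputs HWC tensors, the iterator yields
# NHWC-contiguous GPU batches, so no transpose kernel is needed:
# the tensor is only reinterpreted as channels_last.
loader = DALIClassificationIterator(pipelines=[pipe], reader_name="Reader")
model = model.to(memory_format=torch.channels_last)
for batch in loader:
    images = batch[0]["data"]          # NHWC-contiguous on the GPU
    # Logical NCHW view over the same NHWC memory; no data movement.
    images = images.permute(0, 3, 1, 2)
    output = model(images)
```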

Extract the DALI implementation into a separate file. Update the readme and some configuration files for EfficientNet:

  • dali-gpu is the default data-backend, instead of PyTorch
  • DALI supports AutoAugment (plus a mention of the other automatic augmentations)
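A training run with the new defaults might then be launched like this (a hypothetical invocation; `--data-backend`, `--batch-size`, and `--amp` exist in this repo, while `--automatic-augmentation` and the exact paths follow the DALI copy of the example and may differ here):

```shell
python ./multiproc.py --nproc_per_node 8 \
    ./main.py /imagenet \
    --data-backend dali-gpu \
    --automatic-augmentation autoaugment \
    --batch-size 128 --amp
```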

Fix a typo in the readme files: --data-backends -> --data-backend

This PR backports the changes made to this example when it was introduced into the DALI codebase: https://github.com/NVIDIA/DALI/tree/main/docs/examples/use_cases/pytorch/efficientnet

The changes were tested with the smallest EfficientNet variant only.

Using the DALI GPU pipeline in training can remove the CPU bottleneck and improve GPU utilization on both DGX-1V and DGX-A100 when running with AMP, as covered in this blog post: https://developer.nvidia.com/blog/why-automatic-augmentation-matters/

Please note that in DALI's example we reduced the number of worker threads to half of what is currently set up for PyTorch. That change is not reflected in this PR: the optimal default number of worker threads differs between data backends, so it could be set conditionally, but I don't know what the recommended way to do that would be.
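One possible shape for such a conditional default (a sketch; the function name, the "half for DALI" ratio from DALI's example, and the baseline of 8 workers are assumptions, not project decisions):

```python
def default_workers(data_backend: str, pytorch_workers: int = 8) -> int:
    """Pick a per-backend default for the number of worker threads.

    DALI's GPU pipeline offloads decoding to the GPU, so in DALI's
    copy of this example half the PyTorch loader's workers sufficed.
    """
    if data_backend.startswith("dali"):
        return max(1, pytorch_workers // 2)
    return pytorch_workers
```

For example, `default_workers("dali-gpu")` returns 4 while `default_workers("pytorch")` keeps the full 8.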

klecki • Aug 28 '23 16:08