[REQUEST] Example of H5 dataloader-based multi-node training on Azure VMs
Is your feature request related to a problem? Please describe.
DeepSpeed is a library for high-speed training of large models, and many DL developers run multi-node training on Azure VMs with their data stored as H5 (HDF5) files. However, there is no clear indication of whether H5 files are supported or how to set up DeepSpeed training with H5 data in a multi-node scenario, although there is an explanation of the communication-time speed-ups when using multi-node training.
Describe the solution you'd like
An example of DeepSpeed training on Azure VMs in a multi-node scenario, with the training data stored as H5 files.
Examples like this would help wider adoption and make it easier to build efficient training pipelines with DeepSpeed.
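To make the request concrete, here is a rough sketch of the data-loading part such an example might cover. It is not DeepSpeed's API and not a confirmed recommended setup: the class `H5Dataset`, the helpers `_open_h5` and `shard_indices`, and the file layout (two datasets named `x` and `y`) are all hypothetical names chosen for illustration. It assumes the `h5py` package is installed and that the H5 file is readable from every node.

```python
def _open_h5(path):
    # Hypothetical helper. h5py is a third-party package, imported here so
    # the rest of the sketch can be read and imported without it.
    import h5py
    return h5py.File(path, "r")


class H5Dataset:
    """Map-style dataset over one HDF5 file.

    Assumed (hypothetical) layout: two datasets, 'x' and 'y', sharing the
    same first dimension. Any object with __len__ and __getitem__ like this
    one can be handed to torch.utils.data.DataLoader.
    """

    def __init__(self, path, x_key="x", y_key="y"):
        self.path = path
        self.x_key = x_key
        self.y_key = y_key
        self._file = None  # opened on first read, so each worker process gets its own handle
        with _open_h5(path) as f:  # short-lived handle, only to read the length
            self._length = f[x_key].shape[0]

    def __len__(self):
        return self._length

    def __getitem__(self, idx):
        if self._file is None:
            self._file = _open_h5(self.path)
        return self._file[self.x_key][idx], self._file[self.y_key][idx]


def shard_indices(num_samples, rank, world_size):
    """Sample indices read by one rank.

    A strided split, similar in spirit to an unshuffled distributed sampler
    but without padding, so ranks may differ in length by one sample.
    """
    return list(range(rank, num_samples, world_size))
```

In a real multi-node run, the rank and world size would typically come from the launcher, and a sampler such as `torch.utils.data.distributed.DistributedSampler` would take the place of `shard_indices`. Whether this pattern is the right one for DeepSpeed on Azure VMs is exactly what the requested example would clarify.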