DeepLearningExamples

[Bert/Pytorch] Guide to start container for BERT pretraining with enroot + pyxis

Open mscherrmann opened this issue 2 years ago • 0 comments

Hey,

I am trying to use your framework to pretrain a BERT model from scratch. The only powerful GPUs I have access to are inside a SLURM cluster, which means I have to use enroot and pyxis to set up the container.

The crucial steps in your "Quick Start Guide" are steps 3 and 4, both of which rely on Docker:

3. Build BERT on top of the NGC container: `bash scripts/docker/build.sh`
4. Start an interactive session in the NGC container to run training/inference: `bash scripts/docker/launch.sh`

Do you have any hints on how to handle these two steps in my case?
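
To make the question concrete, here is a rough, unverified sketch of what I imagine the enroot + pyxis equivalents could look like. It only prints the commands rather than running them. The image reference (`TAG` is a placeholder), the squashfs file name `bert.sqsh`, and the `/workspace/bert` mount target are all my assumptions, not values taken from your scripts. As far as I know enroot cannot build a Dockerfile, so whatever `build.sh` adds on top of the base image would presumably still have to be installed inside the container by hand.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints candidate commands, does not execute them.
# All three values below are placeholders/assumptions, override as needed.
: "${IMAGE:=nvcr.io#nvidia/pytorch:TAG}"   # base NGC image, TAG is a placeholder
: "${SQSH:=bert.sqsh}"                     # hypothetical squashfs output file
: "${REPO_DIR:=$PWD}"                      # local checkout of the BERT example

# Rough analogue of step 3: import the base NGC image with enroot.
import_cmd() {
  printf 'enroot import -o %s docker://%s\n' "$SQSH" "$IMAGE"
}

# Rough analogue of step 4: interactive session through the pyxis srun flags.
# The /workspace/bert mount target is an assumption.
launch_cmd() {
  printf 'srun --container-image=%s --container-mounts=%s:/workspace/bert --container-workdir=/workspace/bert --pty bash\n' \
    "$SQSH" "$REPO_DIR"
}

import_cmd
launch_cmd
```

Is something along these lines the intended approach, or is there a recommended way to reproduce what `build.sh` does without Docker?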

Thank you very much in advance!

mscherrmann, Jul 25 '23 15:07