dimeldo

Results 12 comments of dimeldo

So how much storage is it currently allocating for the snapshots? And what type of storage (e.g. SSD or HDD)? I'd really like to be able to set it myself.

Is 128 the max_length of the entire conversation? So, you believe we should put our money on renting a powerful server like 8 Nvidia V100s for training on your...

Still not working... :( Reproducible with `python -m torch.distributed.launch --nproc_per_node=8 src/train.py --config src/configs/gpt2-dailydialog.json` on an AWS p3dn.24xlarge (8 Volta V100s). The program just crashes... It works on 1 GPU, though.
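For context, a minimal sketch of one common pitfall with that launcher (this is a general assumption, not a confirmed diagnosis of the crash above): `torch.distributed.launch` spawns one process per GPU and injects a `--local_rank=<n>` argument into each process, so a training script that doesn't accept that flag will error out under multi-GPU launch while running fine on a single GPU. Sketched here with only `argparse` so it stands alone; `parse_args` is an illustrative helper, not from the repo in question.

```python
import argparse

def parse_args(argv=None):
    # torch.distributed.launch starts nproc_per_node processes and passes
    # --local_rank=<n> to each one; the script must accept this flag,
    # otherwise argparse aborts the process with an "unrecognized
    # arguments" error on every rank.
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=0)
    parser.add_argument("--config", type=str, default=None)
    # parse_known_args tolerates extra launcher-injected arguments.
    args, _unknown = parser.parse_known_args(argv)
    return args

# Simulate the argv the launcher would hand to rank 3 of 8.
args = parse_args(["--local_rank=3",
                   "--config", "src/configs/gpt2-dailydialog.json"])
print(args.local_rank)  # 3
```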

Using this environment: https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/rel_19.10.html

Hi @shamiul94! It's written in my notes that I used this command:

```
mpirun --allow-run-as-root -np 8 -H localhost:8 -bind-to none -map-by slot -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH -x...
```

I can't quite remember. I think the 345M one. I can't remember if multi-GPU worked out alright in the end or not. Good luck in your research!

Is there a way you can implement that? I don't know how to do that myself.

1. You mean the "personality" key data should look like `"personality": [""]`?
2. What should the value of `personality_permutations` be?

@wise-east @sshleifer
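For reference, a minimal sketch of what such an empty-persona record might look like, assuming a PersonaChat-style JSON layout (the `"personality"`, `"utterances"`, `"history"`, and `"candidates"` field names are assumptions based on that format, not confirmed by the maintainers here):

```python
import json

# Hypothetical minimal dataset record with an empty persona: the
# "personality" list holds one empty string instead of persona sentences.
record = {
    "personality": [""],
    "utterances": [
        {
            "history": ["hi , how are you ?"],
            # In the PersonaChat format the true reply is conventionally
            # the last element of the candidates list.
            "candidates": ["i'm fine , thanks !"],
        }
    ],
}

# Round-trip through JSON to confirm the record serializes cleanly.
restored = json.loads(json.dumps(record))
print(restored["personality"])  # ['']
```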