Alex Su
Results: 2 issues of Alex Su
Hi, I am training a Llama2-7b model with Megatron-LM on four H20 nodes, 32 GPUs in total. The parallel strategy is TP=8/PP=2/DP=2. Now, I want to know the data...
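
For reference, the configuration in the post can be sanity-checked with a few lines of plain Python. This is only a sketch: it confirms that TP x PP x DP equals the 32 GPUs mentioned, and lists which global ranks would share a tensor-, data-, or pipeline-parallel group. The rank layout (TP varying fastest, then DP, then PP) is an assumption made for illustration and may not match the ordering used by a given Megatron-LM version; the helper names `rank_coords` and `groups` are hypothetical and not part of Megatron-LM.

```python
# Sketch: check that TP x PP x DP covers all GPUs and list the rank groups.
# ASSUMPTION: global ranks are laid out with TP varying fastest, then DP,
# then PP. The real ordering depends on the Megatron-LM version and flags.
TP, PP, DP = 8, 2, 2
WORLD_SIZE = 32  # "32 GPUs in total", as stated in the post

# The three parallel sizes must multiply to the total number of GPUs.
assert TP * PP * DP == WORLD_SIZE


def rank_coords(rank):
    """Return the (tp, dp, pp) indices of a global rank under the assumed layout."""
    tp = rank % TP
    dp = (rank // TP) % DP
    pp = rank // (TP * DP)
    return tp, dp, pp


def groups(axis):
    """Group global ranks that differ only along `axis` (0=tp, 1=dp, 2=pp)."""
    buckets = {}
    for rank in range(WORLD_SIZE):
        coords = rank_coords(rank)
        key = coords[:axis] + coords[axis + 1:]  # coordinates on the other two axes
        buckets.setdefault(key, []).append(rank)
    return sorted(buckets.values())


tp_groups = groups(0)  # ranks that split one layer's tensors
dp_groups = groups(1)  # ranks that hold replicas and see different data
pp_groups = groups(2)  # ranks that hold consecutive pipeline stages

if __name__ == "__main__":
    print("TP groups:", tp_groups)
    print("DP groups:", dp_groups)
    print("PP groups:", pp_groups)
```

Under this assumed layout, each data-parallel group holds two ranks, which matches DP=2: those two ranks are the ones that would read different shards of the training data, while all ranks in one tensor-parallel group would see the same batch.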