Results: 2 issues by Ethan
**Describe the bug**
I reviewed the initialization of `self.gradient_accumulation_steps` in the DeepSpeedConfig module when only train_batch and micro_batch are set (DeepSpeed version: 0.13.1):

```python
grad_acc = train_batch // micro_batch
grad_acc...
```
bug
training
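For context on the DeepSpeed issue above: the DeepSpeed configuration docs state that `train_batch_size` equals `train_micro_batch_size_per_gpu` × `gradient_accumulation_steps` × number of GPUs. The following is a minimal sketch of deriving the accumulation steps from that relation. The helper name `derive_grad_acc` and its explicit divisibility check are assumptions made for illustration, not DeepSpeed's actual code, and the truncated snippet in the issue may differ.

```python
def derive_grad_acc(train_batch: int, micro_batch: int, world_size: int) -> int:
    """Hypothetical helper (not part of DeepSpeed).

    Solves train_batch = micro_batch * grad_acc * world_size for grad_acc
    and refuses inputs where floor division would silently drop samples.
    """
    if min(train_batch, micro_batch, world_size) <= 0:
        raise ValueError("all batch parameters must be positive integers")

    # Samples consumed by one micro-step across all ranks.
    samples_per_micro_step = micro_batch * world_size

    if train_batch % samples_per_micro_step != 0:
        raise ValueError(
            f"train_batch={train_batch} is not divisible by "
            f"micro_batch * world_size = {samples_per_micro_step}"
        )
    return train_batch // samples_per_micro_step


if __name__ == "__main__":
    # 32 samples per optimizer step, 4 per GPU per micro-step, 2 GPUs.
    print(derive_grad_acc(train_batch=32, micro_batch=4, world_size=2))
```

Plain floor division would accept a combination such as `train_batch=30, micro_batch=4, world_size=2` and return 3, which no longer satisfies the batch-size relation. The sketch raises an error in that case instead.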
Hi everyone, I'm currently working on a project involving Megatron-LM and I'm looking for a way to obtain the graphs (computation graphs) of sub-models after partitioning, along with the attributes...