MoE - Token dropping for Full Tensor Parallelism

Open siddharth9820 opened this issue 3 years ago • 0 comments

This PR enables token dropping for full tensor parallelism and also corrects the timers.

(Still WIP)

siddharth9820 • Aug 18 '22 07:08