
[autoparallel] Draft for mix gather

Open zhang677 opened this issue 3 years ago • 1 comments

What's new? Add a one-step transformation called mix-gather for:

| Src  | Dst |
| ---- | --- |
| S0S1 | RR  |
| S1S0 | RR  |
| S01R | RR  |
| RS01 | RR  |
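To make the four source layouts concrete, here is a minimal NumPy sketch that shards a 2-D tensor on a simulated `(n0, n1)` device mesh and rebuilds the fully replicated tensor from one flat gather over all devices. It is an illustration only, not the code from this draft: the helper names `shard` and `mix_gather`, the row-major rank order `rank = i * n1 + j`, and the reading of `S01` as "sharded over both mesh axes in that rank order" are assumptions.

```python
import numpy as np


def shard(full, spec, mesh_shape):
    """Return the local shards in rank order (rank = i * n1 + j, assumed)."""
    n0, n1 = mesh_shape
    shards = []
    for i in range(n0):
        for j in range(n1):
            if spec == "S0S1":    # dim 0 over mesh axis 0, dim 1 over mesh axis 1
                piece = np.array_split(full, n0, axis=0)[i]
                piece = np.array_split(piece, n1, axis=1)[j]
            elif spec == "S1S0":  # dim 0 over mesh axis 1, dim 1 over mesh axis 0
                piece = np.array_split(full, n1, axis=0)[j]
                piece = np.array_split(piece, n0, axis=1)[i]
            elif spec == "S01R":  # dim 0 over both mesh axes, dim 1 replicated
                piece = np.array_split(full, n0 * n1, axis=0)[i * n1 + j]
            elif spec == "RS01":  # dim 0 replicated, dim 1 over both mesh axes
                piece = np.array_split(full, n0 * n1, axis=1)[i * n1 + j]
            else:
                raise ValueError(f"unsupported spec: {spec}")
            shards.append(piece)
    return shards


def mix_gather(shards, spec, mesh_shape):
    """Rebuild the RR tensor from all shards, as one flat gather would deliver them."""
    n0, n1 = mesh_shape
    if spec == "S0S1":
        # Device (i, j) holds block row i, block column j.
        return np.block([[shards[i * n1 + j] for j in range(n1)] for i in range(n0)])
    if spec == "S1S0":
        # Device (i, j) holds block row j, block column i, so the blocks are reordered.
        return np.block([[shards[i * n1 + j] for i in range(n0)] for j in range(n1)])
    if spec == "S01R":
        return np.concatenate(shards, axis=0)
    if spec == "RS01":
        return np.concatenate(shards, axis=1)
    raise ValueError(f"unsupported spec: {spec}")


full = np.arange(64).reshape(8, 8)
mesh_shape = (2, 4)
for spec in ("S0S1", "S1S0", "S01R", "RS01"):
    rebuilt = mix_gather(shard(full, spec, mesh_shape), spec, mesh_shape)
    print(spec, np.array_equal(rebuilt, full))
```

In this sketch only the reassembly step differs between layouts; the gather itself is the same flat collection of `n0 * n1` shards in every case.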

Why do we need this? Reduce the communication cost. Assume $\beta_1 \gt \beta_0$, $M$ is the communication size.

Cost for S0S1=>S0R=>RR is $\frac{M}{n_1n_0}\times\frac{n_1-1}{n_1}\times\beta_1 + \frac{M}{n_1}\times\frac{n_0-1}{n_0}\times\beta_0$

Cost for S0S1=>RR is $\frac{M}{n_0n_1}\times\frac{n_0n_1-1}{n_0n_1}\times\beta_1$
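As a quick numeric check, the two cost expressions above can be evaluated directly. The sketch below transcribes them term by term; the function names and the sample values are hypothetical, and reading $n_0$, $n_1$ as the device counts along the two mesh axes and $\beta_0$, $\beta_1$ as the per-unit communication costs along those axes is an assumption, since the issue does not define them.

```python
def two_step_cost(M, n0, n1, beta0, beta1):
    """Cost of S0S1 => S0R => RR, transcribed from the formula in this issue."""
    first = M / (n1 * n0) * (n1 - 1) / n1 * beta1
    second = M / n1 * (n0 - 1) / n0 * beta0
    return first + second


def mix_gather_cost(M, n0, n1, beta1):
    """Cost of the one-step S0S1 => RR, transcribed from the formula in this issue."""
    n = n0 * n1
    return M / n * (n - 1) / n * beta1


# Hypothetical sample values with beta1 > beta0, as assumed in the issue.
M, n0, n1 = 1024, 2, 4
beta0, beta1 = 1.0, 2.0

print(two_step_cost(M, n0, n1, beta0, beta1))  # 320.0
print(mix_gather_cost(M, n0, n1, beta1))       # 224.0
```

For these sample values the one-step cost is lower; other mesh shapes or $\beta$ ratios would need to be checked separately.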

Pitfalls: Peak memory increases by the size of one tensor for S0S1=>RR and S1S0=>RR.

zhang677, Nov 18 '22 02:11

Could you please push the unit test file for this feature?

YuliangLiu0306, Nov 18 '22 03:11