
GPT-MoE support for expert parallel

Open · YJHMITWEB opened this issue on Aug 15, 2023 · 0 comments

Hi, I am wondering whether the GPT-with-MoE example provided at https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gpt_guide.md#gpt-with-moe supports expert parallelism. The provided examples use nlp_gpt3_text-generation_0.35B_MoE-64, but they only expose tensor-parallel and pipeline-parallel options.
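
For reference, this is roughly how the example is launched today; only tensor and pipeline parallelism are exposed. (The script path and flag names here follow the multi-GPU GPT example and may differ in your checkout.)

```bash
# Roughly how the guide's GPT-MoE example is launched today (script path and
# flag names follow the multi-GPU GPT example; they may differ in your checkout).
mpirun -n 2 --allow-run-as-root \
    python examples/pytorch/gpt/multi_gpu_gpt_example.py \
        --tensor_para_size 2 \
        --pipeline_para_size 1 \
        --ckpt_path /path/to/nlp_gpt3_text-generation_0.35B_MoE-64/2-gpu
# There is no expert-parallel flag (e.g. a hypothetical --expert_para_size)
# exposed here, which is what this issue asks about.
```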

Since FasterTransformer's Swin-Transformer-Quantization folder builds on the Swin-MoE repo, which does support expert parallelism, I'd like to know how to enable this feature for GPT-MoE as well. A sketch of the layout I have in mind follows.
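
For context, here is a minimal single-process sketch (my own illustration, not FasterTransformer code) of what expert parallelism would mean for this model: each rank owns a disjoint slice of the 64 experts, instead of every rank holding all experts as under tensor or pipeline parallelism.

```python
# Minimal single-process sketch of expert parallelism (illustration only,
# not FasterTransformer code): each of `world_size` ranks owns a disjoint
# slice of the experts, and tokens are routed to the rank that owns the
# expert the gate selected.
num_experts = 64   # e.g. the 64 experts in nlp_gpt3_text-generation_0.35B_MoE-64
world_size = 8     # hypothetical number of GPUs

experts_per_rank = num_experts // world_size

def owner_rank(expert_id: int) -> int:
    """Rank that holds the parameters of `expert_id`."""
    return expert_id // experts_per_rank

# Under tensor/pipeline parallelism every rank holds (a shard of) every
# expert; under expert parallelism rank r holds only experts
# [r * experts_per_rank, (r + 1) * experts_per_rank), so tokens must be
# exchanged (all-to-all) before and after the expert FFN.
for expert_id in (0, 7, 8, 63):
    print(f"expert {expert_id} lives on rank {owner_rank(expert_id)}")
```

Swin-MoE implements exactly this kind of expert placement with an all-to-all token exchange, which is why I'm hoping the GPT-MoE path can expose it too.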

YJHMITWEB · Aug 15 '23 09:08