DeepSpeed
[REQUEST] Why can't pipeline parallelism be combined with sequence parallelism?
I was a little disappointed to see this assertion: assert tensor_model_parallel_size == 1 and pipeline_model_parallel_size == 1, 'DeepSpeed\'s sequence parallel does not work with tensor parallel or pipeline parallel'
mark