Le, Jiang
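The current open-source version of DistributedDense supports only the column-wise split pattern, but it reserves an interface for implementing other split patterns. Our WIP version already supports different split patterns and can use a predefined cost model to evaluate the overhead of each split pattern for a given model, so that it automatically searches for a better split plan. This part of the work is still being refined.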
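For readers unfamiliar with the column-wise split pattern mentioned above, here is a minimal plain-Python sketch of the idea. It is not DistributedDense's actual code: the helper names `matmul`, `split_columns`, and `column_split_dense` are hypothetical, and the "shards" are computed sequentially on one machine instead of on separate devices.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w, num_shards):
    """Split weight matrix w into num_shards contiguous column blocks."""
    n_cols = len(w[0])
    bounds = [n_cols * i // num_shards for i in range(num_shards + 1)]
    return [[row[lo:hi] for row in w] for lo, hi in zip(bounds, bounds[1:])]


def column_split_dense(x, w, num_shards):
    """Compute x @ w by giving each (hypothetical) worker one column block of w."""
    shards = split_columns(w, num_shards)
    # Each worker needs the full input x but only its own slice of the weights.
    partial = [matmul(x, shard) for shard in shards]
    # Concatenating the partial outputs along the column axis restores x @ w.
    return [sum((p[r] for p in partial), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2], [0, 1, 3]]
full = matmul(x, w)
sharded = column_split_dense(x, w, num_shards=2)
print(full == sharded)  # True
```

With a column-wise split, every worker holds the whole input and the outputs only need to be concatenated. Other patterns trade this off differently (for example, a row-wise split of the weights needs a sum across workers instead), which is the kind of difference a cost model would have to weigh.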
Any progress? I'm also encountering the same issue.

> ```
> **Your question**
> Ask a clear and concise question about Megatron-LM.
> /workspace/megatron/megatron/core/models/gpt/gpt_layer_specs.py:77: UserWarning: The fp8 argument in "get_gpt_layer_with_transformer_engine_spec"...
msccl-allreduce leads to less communication overhead than nccl-allreduce. Are there any plans to adopt this implementation? https://github.com/sgl-project/sglang/commit/8e3797be1ca9e3f0c68ff53c86e363bbfeffa268