noob-ctrl

15 comments by noob-ctrl

@ethanhe42 When `transformer-impl` is `local`, it reports the following error:

```
AssertionError: (RMSNorm) is not supported in FusedLayerNorm when instantiating FusedLayerNorm when instantiating TransformerLayer
```

When `transformer-impl` is `transformer_engine`, the following code...

@ethanhe42 When `transformer-impl` is set to `transformer_engine`, the following code does not seem to define RMSNorm? ![image](https://github.com/NVIDIA/Megatron-LM/assets/63763578/d72e9f29-2dbf-48c9-93b4-5e71ae4cfe4f)

@lleizuo Hello, have you solved this problem?

@c-maxey Sorry, I still don't quite get it. Can you show me a simple example?

@tgale96 OK, dMoE works. Does megablocks only support data parallelism and expert parallelism? Does that mean it's impossible to train a larger model?

@tgale96 When I tried to merge the model weights, I encountered the following problem: ![image](https://github.com/stanford-futuredata/megablocks/assets/63763578/5b90fff7-822a-490e-b790-217f9eb8b4e4)

My script is:

```
python tools/checkpoint_util.py \
    --model-type GPT \
    --load-dir /gpudisk1/openmoe/data \
    --save-dir /gpudisk1/openmoe/data/he_checkpoint...
```