noob-ctrl
I also encountered this problem
@Orion-Zheng Has this bug been solved?
I also encountered this problem
@stas00 Hi, it works now. Thank you!
@ethanhe42 When `transformer-impl` is `local`, it reports the following error:
```
AssertionError: (RMSNorm) is not supported in FusedLayerNorm
when instantiating FusedLayerNorm
when instantiating TransformerLayer
```
When `transformer-impl` is `transformer_engine`, the following code...
@ethanhe42 When `transformer-impl` is set to `transformer_engine`, the following code does not seem to define RMSNorm, does it?
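As context for the error above: RMSNorm normalizes by the root-mean-square of the activations without subtracting the mean, which is why a fused LayerNorm kernel cannot simply stand in for it. A minimal pure-Python sketch of the math (an illustration only, not Megatron-LM's or Transformer Engine's actual implementation):

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Minimal RMSNorm over a 1-D list: x / rms(x) * weight.

    Unlike LayerNorm, there is no mean-centering and no bias term;
    only the learned per-element scale `weight` is applied.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * w for v, w in zip(x, weight)]

# After normalization the mean square of the output is ~1
# (up to eps), regardless of the input scale.
y = rms_norm([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```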
@lleizuo Hello, have you solved this problem?
@c-maxey Sorry, I still don't quite get it. Could you show me a simple example?
@tgale96 OK, dMoE works now. Does megablocks only support data parallelism and expert parallelism? If so, does that mean it's impossible to train a larger model?
@tgale96 When I try to merge the model weights, I encounter the following problem. My script is:
```
python tools/checkpoint_util.py \
  --model-type GPT \
  --load-dir /gpudisk1/openmoe/data \
  --save-dir /gpudisk1/openmoe/data/he_checkpoint...
```