noob-ctrl

Results 6 issues of noob-ctrl

The version information is as follows:
- DeepSpeed 0.8.1
- transformers 4.26.1

## Problem
When I use Trainer with DeepSpeed, the number of trainable parameters is reported as 0. Like this: ![image](https://user-images.githubusercontent.com/63763578/225277324-3650bbea-78f7-493a-97a2-3f9ef0bdcd5a.png)...
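One possible cause, offered only as an assumption because the issue text is truncated and does not show the DeepSpeed config: under ZeRO stage 3, parameters are partitioned across ranks, so `numel()` on the local placeholder can return 0 while DeepSpeed records the full size in a `ds_numel` attribute. The sketch below uses a hypothetical `FakeParam` stand-in so it runs without torch or DeepSpeed. `count_trainable` is an illustrative helper, not a function from either library.

```python
class FakeParam:
    """Hypothetical stand-in for a parameter; not a torch or DeepSpeed class."""

    def __init__(self, local_numel, requires_grad=True, ds_numel=None):
        self._local_numel = local_numel
        self.requires_grad = requires_grad
        if ds_numel is not None:
            # Assumption: ZeRO-3 stores the full (unpartitioned) size here.
            self.ds_numel = ds_numel

    def numel(self):
        # Size of the local tensor; 0 when the parameter is partitioned away.
        return self._local_numel


def count_trainable(params):
    """Count trainable elements, preferring ds_numel when it is present."""
    total = 0
    for p in params:
        if p.requires_grad:
            total += p.ds_numel if hasattr(p, "ds_numel") else p.numel()
    return total


params = [
    FakeParam(0, ds_numel=1000),
    FakeParam(0, ds_numel=24),
    FakeParam(0, requires_grad=False, ds_numel=500),  # frozen, not counted
]

# Naive count that only looks at the local placeholder tensors.
naive = sum(p.numel() for p in params if p.requires_grad)
print(naive, count_trainable(params))
```

If this is what is happening, the 0 would be a reporting artifact rather than a sign that nothing is being trained.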

If I want to use the new PyTorch 2.0 feature `torch.compile`, what should I do? Where should I put the following code, or should I just pass a command-line parameter? ``` model...

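Hello, it looks like the code performs only one forward pass. In that case, don't x and the positive example use the same dropout? And wouldn't the two outputs then be identical?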
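A minimal sketch of why one forward pass need not imply one dropout mask. It assumes the code in question places each input in the batch twice, as SimCSE-style unsupervised training does; the question does not show the code, so this is an assumption. The `dropout` function is a simplified pure-Python illustration of inverted dropout, not the library implementation, and the 256-dimensional vector is a hypothetical sentence representation.

```python
import random


def dropout(batch, p, rng):
    """Inverted dropout with an independent keep/drop draw per element of every row."""
    scale = 1.0 / (1.0 - p)
    return [
        [x * scale if rng.random() >= p else 0.0 for x in row]
        for row in batch
    ]


rng = random.Random(0)
x = [1.0] * 256   # hypothetical representation of one sentence
batch = [x, x]    # the same input appears twice in one batch

# A single "forward pass" over the whole batch.
out_a, out_b = dropout(batch, p=0.1, rng=rng)

# The two rows received different masks even though the inputs were identical.
print(out_a == out_b)
```

Because the mask is drawn per element rather than once per call, the two copies of x get different masks, so their outputs can serve as a positive pair.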

Does Megatron-Core support LLAMA models?

I tried setting the fp16 parameter to True and to False. Why does training take longer when it is set to True?

I followed these steps: - run `docker build . -t megablocks-dev` - then run `bash docker.sh` to launch the container. When I run `moe_46m_8gpu.sh` to test, it reported the...

question