noob-ctrl

Results 15 comments of noob-ctrl

@tgale96 After I run the `dmoe_46m_8gpu.sh` script, the saved model is in the following format, with a `model_optim_rng.pt` in each folder: ![image](https://github.com/stanford-futuredata/megablocks/assets/63763578/6a6a7af4-3332-4e62-96b8-85cfa59d5471) I want to merge these weights into a...

@tgale96 ![image](https://github.com/stanford-futuredata/megablocks/assets/63763578/59e8f200-26f6-4436-8309-3e5c2eb87cba)

@ShinoharaHare Hi, have you solved this problem?

@laixinn I deployed the model service on 2 H20s. After deploying with the command you showed, I got an error message when requesting the API. The error info as...

@laixinn I solved this problem, thanks. In addition, I would like to ask whether there is any comparative experimental data for the Deepseek-R1-FP8 model?