Mark Nestor Costantini
Currently, postfit symlinks are sorted only by the first digit of the replica folder number (i.e. lexicographically), e.g. 1, 10, 11, 12, 13, ..., 2, 20, ... This PR fixes this st...
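The ordering problem described above can be reproduced, and avoided, with a numeric sort key. The following is a minimal sketch and not the code from the PR: the folder names (`replica_1`, ...) and the helper `replica_index` are hypothetical, chosen only to illustrate the difference between lexicographic and numeric ordering.

```python
import re


def replica_index(name: str) -> int:
    """Return the trailing integer of a folder name such as 'replica_12'.

    Hypothetical helper: the real folder naming scheme is an assumption.
    """
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no replica number found in {name!r}")
    return int(match.group(1))


# Hypothetical replica folder names, for illustration only.
folders = ["replica_1", "replica_10", "replica_11", "replica_2", "replica_20", "replica_3"]

# Default string sort compares character by character, so "10" sorts before "2".
lexicographic = sorted(folders)

# Sorting on the parsed integer gives the expected 1, 2, 3, 10, 11, 20 order.
numeric = sorted(folders, key=replica_index)

print(lexicographic)
print(numeric)
```

Parsing the number explicitly, rather than zero-padding the names, keeps existing folder names untouched.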
Addresses #2159.

The main goals of this PR are:
- Multiclosure analysis modules are reviewed and rewritten based on the new findings of the inconsistent closure tests paper.
- All...
TODO
- [ ] Still need to add multiclosure analysis documentation and review the old closure test analysis page
- [x] Add link to paper once it's on arXiv
# What does this PR do?

Sets the default value of `mlp_impl` to `grouped` in the `Arguments` dataclass in `megablocks/layers/arguments.py`. This prevents the default dmoe script `exp/dmoe/dmoe_46m_8gpu.sh` from crashing...
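As a rough illustration of why a dataclass default matters here, the sketch below shows a toy dataclass whose `__post_init__` validates `mlp_impl`. It is not the megablocks code: `ToyArguments`, `SUPPORTED_MLP_IMPLS`, and the validation condition are all hypothetical, and the real check in `Arguments.__post_init__` may differ.

```python
from dataclasses import dataclass

# Hypothetical: the implementations this toy environment can run.
SUPPORTED_MLP_IMPLS = {"grouped"}


@dataclass
class ToyArguments:
    # With a supported default, constructing ToyArguments() with no
    # arguments succeeds; an unsupported default would fail every time.
    mlp_impl: str = "grouped"

    def __post_init__(self) -> None:
        # Runs automatically after the generated __init__.
        if self.mlp_impl not in SUPPORTED_MLP_IMPLS:
            raise ValueError(f"mlp_impl={self.mlp_impl!r} is not supported here")


args = ToyArguments()  # uses the default, passes validation
print(args.mlp_impl)
```

Because `__post_init__` runs on every construction, a script that relies on the default never gets a chance to override an unsupported value before the error is raised.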
The standard dmoe script crashes because the default value of `mlp_impl` is set to `sparse` in the `Arguments` dataclass in `megablocks/layers/arguments.py`. Running `exp/dmoe/dmoe_46m_8gpu.sh` crashes with:

```
[rank0]: File "/home/mark/codes/megablocks/megablocks/layers/arguments.py", line 82, in __post_init__
[rank0]: raise ValueError(...
```