David Thorsley
Results: 2 issues of David Thorsley
**Describe the bug** Phi-3-mini-4k-instruct does not run with Automodel. I've tried two cases: 1. using `nemo_automodel.NeMoAutoModelForCausalLM.from_pretrained` as the model `_target_` in the config. In this case I get the error...
bug
Adds support for greedy expert selection to the MoE layers, which will allow the DeepSeek V3 model to run DeepSeek V2 checkpoints. Label r0.1.0 added to request a cherry-pick if...
r0.1.0