JetMoE
Pretraining dataset and code request
Will the pretraining datasets and corresponding code be open-sourced?
Thanks!
Hi, thanks for the great work. I'd also be particularly interested in the training code, or at least in some details of your multi-node setup. What did you use for parallelization across GPU nodes?
Same. Looking forward to the open-source training code and details.
Thanks for the amazing work and the paper. I would really love to explore your training code.
+1
https://huggingface.co/jetmoe/jetmoe-8b/discussions/5