[ENHANCEMENT] Do you have a plan that supports Mixtral 8x7B?

Open matrixssy opened this issue 2 years ago • 7 comments

matrixssy avatar Jan 04 '24 01:01 matrixssy

The same question.

cdj0311 avatar Jan 14 '24 11:01 cdj0311

> The same question.

I am working on it, but I am not sure whether it will be accepted.

matrixssy avatar Jan 15 '24 01:01 matrixssy

@matrixssy Any progress? Thank you.

bityigoss avatar Jan 17 '24 07:01 bityigoss

See https://github.com/NVIDIA/Megatron-LM/pull/667

matrixssy avatar Jan 18 '24 12:01 matrixssy

Hi, please refer to this script for MoE/Mixtral training.

yanring avatar Feb 15 '24 01:02 yanring

> Hi, please refer to this script for MoE/Mixtral training.

Great! How can I convert Hugging Face (HF) weights to Megatron (MG) format, and is it possible to convert them back after training?

matrixssy avatar Feb 17 '24 03:02 matrixssy

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Apr 17 '24 18:04 github-actions[bot]