GRPO Qwen3 megatron training script

Open mpj1234 opened this issue 3 months ago • 2 comments

Could an example script be provided for training a Qwen3 model with the GRPO algorithm using Megatron as the backend?

Nov 04 '25 10:11 mpj1234

I don't know if you've already seen them, but these examples might be helpful: https://github.com/volcengine/verl/tree/main/examples/grpo_trainer

You'll find some Qwen3 examples at the end.
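
If it helps, here is a rough sketch for picking the relevant scripts out of a local clone. It assumes the script file names contain both "qwen3" and "megatron", which I haven't verified against the repository; the clone path and the helper names `find_scripts` and `list_example_scripts` are made up for illustration.

```python
from pathlib import Path


def find_scripts(names, keywords=("qwen3", "megatron")):
    """Return the names that contain every keyword, case-insensitively."""
    return sorted(
        name for name in names
        if all(key in name.lower() for key in keywords)
    )


def list_example_scripts(examples_dir):
    """List the *.sh file names found directly inside examples_dir."""
    return [path.name for path in Path(examples_dir).glob("*.sh")]


if __name__ == "__main__":
    # Assumed location of a local clone; adjust to wherever the repo lives.
    names = list_example_scripts("verl/examples/grpo_trainer")
    for name in find_scripts(names):
        print(name)
```

If nothing is printed, the naming assumption probably doesn't hold, and browsing the directory by hand is the safer option.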

Nov 08 '25 00:11 RitvikKapila

Thank you, I'll study them.

Nov 24 '25 07:11 mpj1234