BlueRum

Results 37 comments of BlueRum

> This [picture](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/chatgpt.png) is from InstructGPT, maybe there should be copyright information?

Thanks for the reminder; we have mentioned the reference here.

> > Hi, I found this [picture](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/experience.jpg) a little confusing. I think if there is some description, it would be much better.
>
> Yes, at least tell the readers:...

We suggest using ```python -m pip install --upgrade pip``` to update your pip.

Thanks for your feedback. You are using the DDP strategy, which is naive and costs much more GPU memory. You can try ```torchrun --standalone --nproc_per_node 4 benchmark_gpt_dummy.py --model m...```
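For anyone new to `torchrun`: below is a minimal sketch of the standard PyTorch distributed setup the launcher expects (my own illustration, not the benchmark script itself). `torchrun --standalone --nproc_per_node 4` spawns 4 processes and exports `LOCAL_RANK`, `RANK`, and `WORLD_SIZE` for each; the script name here is a placeholder.

```python
import os
import torch
import torch.distributed as dist

# Launched as: torchrun --standalone --nproc_per_node 4 your_script.py
def setup_distributed():
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun per process
    torch.cuda.set_device(local_rank)            # one GPU per process
    dist.init_process_group(backend="nccl")      # reads RANK/WORLD_SIZE from env
    return local_rank, dist.get_world_size()
```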

> @ver217 , should the `dist=0` be outside the loop?

Yep, I will fix it soon.

Thank you for your feedback, and sorry about the late reply. In /applications/ChatGPT/examples/ we have 3 examples: train_dummy -> shows the vanilla way to start **training step 3**; train_prompts...

Because, as we see it, the RL training process here is a one-step process, which means there isn't a next_state.
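To illustrate the point (a minimal sketch of my own, not the repo's implementation): when the whole generated response is treated as a single action and the episode ends there, the usual bootstrap term `gamma * V(next_state)` disappears, so the return is just the reward.

```python
def one_step_advantage(reward, value):
    """Advantage for a one-step episode.

    Generic TD target: reward + gamma * V(next_state).
    Here the episode ends after the single action (the generated response),
    so there is no next_state and the target reduces to the reward itself.
    """
    td_target = reward          # no gamma * next_value bootstrap term
    return td_target - value    # advantage fed to the PPO update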

I'll close this issue now; please reopen it if you have further questions.

Thank you for your feedback. We do not suggest using the loss to evaluate training in the RM (reward model) training task. It's shown in the paper that the loss will be...
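A common alternative signal is pairwise ranking accuracy: how often the reward model scores the chosen response above the rejected one. Below is a minimal sketch of my own (not the repo's evaluation code), assuming a `reward_model` that returns one scalar score per sequence.

```python
import torch

@torch.no_grad()
def ranking_accuracy(reward_model, eval_batches):
    """Fraction of pairs where the chosen response outscores the rejected one."""
    correct, total = 0, 0
    for chosen_ids, chosen_mask, rejected_ids, rejected_mask in eval_batches:
        r_chosen = reward_model(chosen_ids, attention_mask=chosen_mask)
        r_rejected = reward_model(rejected_ids, attention_mask=rejected_mask)
        correct += (r_chosen > r_rejected).sum().item()
        total += r_chosen.numel()
    return correct / total
```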

Hi @Qian0733, thank you for your feedback, but we can't reproduce the bug. It seems there is something wrong with your environment. Could you give us more information about your...