xulangping
Same problem here.
With the 7b model there is no problem, but with the 13b model I hit the same failure at 00005.bin. Could it be an out-of-memory (OOM) error? @ssemeniuta @thashim
Loading checkpoint shards: 67%|████████████ | 4/6 [00:27
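To sanity-check the OOM guess, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. It assumes "7b" and "13b" mean roughly 7e9 and 13e9 parameters, and it ignores activations, optimizer state, and any temporary copies made while shards are loaded, so real usage will be higher. The function name is just made up for illustration.

```python
def estimate_weights_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory (GiB) needed to hold the model weights alone."""
    return num_params * bytes_per_param / 1024**3


if __name__ == "__main__":
    # Nominal parameter counts (assumption: "7b" ~ 7e9, "13b" ~ 13e9).
    models = {"7b": 7e9, "13b": 13e9}
    # Bytes per parameter for common dtypes.
    dtypes = {"fp16": 2, "fp32": 4}

    for model_name, num_params in models.items():
        for dtype_name, size in dtypes.items():
            gib = estimate_weights_gib(num_params, size)
            print(f"{model_name} {dtype_name}: ~{gib:.1f} GiB for weights only")
```

If the 13b figure is close to or above the RAM (or GPU memory) available on the machine, a crash partway through loading the shards would be consistent with OOM.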
> For the step2 scoring: `python3 training/step2_reward_model_finetuning/rw_eval.py --model_name_or_path output/reward-models/350m/`
>
> ==================Eval result============================
> prompt: Human: Please tell me about Microsoft in a few sentence? Assistant:
>
> good_ans: Microsoft is a software...