ColossalAI

train_reward_model loss is random

Open ipackhu opened this issue 2 years ago • 1 comments

I used the default rm_static dataset with train_data set to 75000 and batch_size 4, on a machine with 8 A100 80G GPUs. After 13 hours of training, the loss still looks random. Otherwise the training seems to run without problems.

Base model: opt-iml-max-1.3b

[3 images attached]

why?

ipackhu avatar Feb 25 '23 00:02 ipackhu

Thank you for your feedback. We do not recommend using the loss to evaluate training progress in the RM (reward model) training task. The paper shows that the loss stays in the range 0.4~0.7. We will soon add evaluation based on accuracy and the distance between positive-negative pairs.
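To illustrate why accuracy and pair distance are easier to read than the raw loss, here is a minimal sketch. It assumes the common pairwise log-sigmoid ranking loss, -log(sigmoid(r_chosen - r_rejected)); whether the trainer uses exactly this loss is an assumption. The function name `pairwise_metrics` and its dictionary keys are hypothetical and not part of ColossalAI. Under this loss, a model that scores both responses equally gets log(2) ≈ 0.693, so a loss hovering near 0.7 can look random even while accuracy improves.

```python
import math


def pairwise_metrics(chosen_rewards, rejected_rewards):
    """Compute loss, accuracy and mean distance for reward pairs.

    Hypothetical helper for illustration only. Each index i holds the
    reward of the preferred (chosen) and the rejected response for the
    same prompt.
    """
    if not chosen_rewards or len(chosen_rewards) != len(rejected_rewards):
        raise ValueError("need two non-empty lists of equal length")

    # Margin between the preferred and the rejected response.
    margins = [c - r for c, r in zip(chosen_rewards, rejected_rewards)]

    # -log(sigmoid(m)) == log(1 + exp(-m)), written in a numerically
    # stable form so large |m| does not overflow.
    losses = [max(-m, 0.0) + math.log1p(math.exp(-abs(m))) for m in margins]

    n = len(margins)
    return {
        "loss": sum(losses) / n,                 # assumed ranking loss
        "acc": sum(m > 0 for m in margins) / n,  # chosen ranked higher
        "dist": sum(margins) / n,                # mean reward gap
    }


if __name__ == "__main__":
    # Untrained model: identical scores, so the loss equals log(2).
    print(pairwise_metrics([0.0, 0.0], [0.0, 0.0]))
    # Partly trained model: two of three pairs ranked correctly.
    print(pairwise_metrics([2.0, 1.0, 0.0], [1.0, 2.0, -1.0]))
```

Tracking `acc` and `dist` on a held-out set gives a clearer signal than the loss curve alone: accuracy above 0.5 and a growing positive distance indicate the model is learning to separate the pairs.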

ht-zhou avatar Mar 02 '23 09:03 ht-zhou

We have updated a lot. This issue was closed due to inactivity. Thanks.

binmakeswell avatar Apr 26 '23 07:04 binmakeswell