xHansonx — 2 issues

Based on the results you showed in the gists, it looks like the training and validation losses are diverging instead of converging as the number of epochs increases. So, what's the problem...

### 🐛 Describe the bug

Code:

```
torchrun --standalone --nproc_per_node=1 train_reward_model.py --dataset Dahoas/rm-static --subset ../../../datasets/Dahoas_rm-static --max_len 512 --model gpt2 --pretrain ../../../gpt2/gpt2-small --lora_rank 0 --max_epochs 1 --batch_size 1 --loss_fn log_sig --test...
```
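For context on the `--loss_fn log_sig` flag above: in reward-model training on pairwise preference data such as Dahoas/rm-static, `log_sig` presumably refers to the pairwise log-sigmoid ranking loss, `-log(sigmoid(r_chosen - r_rejected))`. A minimal sketch, assuming that interpretation (the function name `log_sig_loss` here is hypothetical, not from the repository):

```python
import math

def log_sig_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Pairwise log-sigmoid ranking loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as the model scores the chosen response further
    above the rejected one. A sketch of the assumed `log_sig` objective,
    not the repository's actual implementation.
    """
    diff = chosen_reward - rejected_reward
    # -log(sigmoid(d)) = log(1 + exp(-d)); branch for numerical stability
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

# A larger margin between chosen and rejected yields a smaller loss:
print(log_sig_loss(2.0, 0.0) < log_sig_loss(0.5, 0.0))  # True
```

If the training loss drops while the validation loss rises under this objective, the usual suspect is overfitting to the preference pairs rather than a bug in the loss itself.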

Label: bug