RLHF-Reward-Modeling
Code to reproduce ArmoRM
Hi, this is great work. Is there a plan to release the training code to reproduce the model?
Sorry for the delay. I will try to release it this month.
What about the MoE code for calculating the coefficients?
Will release the code this week!
Hi Haoxiang, when will the code for ArmoRM be released?
Code released!