
RLHF training code for StableVicuna open sourced?

Open • REIGN12 opened this issue 2 years ago • 1 comment

Very exciting to see your remarkable work on StableVicuna! I read through your blog and noticed that all the datasets are open sourced and available. For the training code, however, the only detail mentioned is that you are using trlx. Will there be a more detailed recipe or code for the RL tuning phase? Many thanks in advance, and I really appreciate your effort!

REIGN12 • Apr 30 '23 09:04

WIP.

LouisCastricato • Apr 30 '23 16:04