Yicheng Zou

Results 3 issues of Yicheng Zou

The toolkit is really amazing and convenient. I wonder how it can work for non-English languages.

The version 4.5 update added the Chronicled Wish.

Hello OpenRLHF Team, We have encountered a significant issue with reward instability and a lack of reproducibility during RL sampling (PPO) when using **vLLM v0.8.3**. Our experiments show that **vLLM...