Yicheng Zou
Results: 3 issues of Yicheng Zou
The toolkit is really amazing and convenient. I wonder how it can work for non-English languages.
Add Chronicled Wish
The version 4.5 update added the Chronicled Wish.
Hello OpenRLHF Team, we have encountered a significant issue with reward instability and a lack of reproducibility during RL sampling (PPO) when using **vLLM v0.8.3**. Our experiments show that **vLLM...