mirrorboat
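### System Info

In `RayPPOTrainer`, `_balance_batch` reorders the batch, which appears to leave `reward_extra_infos_dict` in a different order from the batch by the time `_log_rollout_data` runs.

Note: this is a potential issue I noticed while reading the code; I have not yet tried to reproduce it.

### Information

- [ ] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported...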
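
To make the ordering concern under System Info concrete, here is a minimal sketch in plain Python. It does not use the trainer's real data structures: `reorder`, `prompts`, `reward_extra_infos`, and `perm` are hypothetical stand-ins, and the assumption that the extra-info lists are built before the batch is reordered is exactly the unverified part of this report.

```python
# Hypothetical stand-ins only; these are not the trainer's actual objects.
def reorder(items, perm):
    """Return a new list where position i holds items[perm[i]]."""
    return [items[i] for i in perm]

# Per-sample data, assumed to be computed before any rebalancing.
prompts = ["p0", "p1", "p2", "p3"]
reward_extra_infos = {"acc": [0.0, 1.0, 0.5, 0.25]}

# Suppose a balancing step reorders the batch but not the extra-info lists.
perm = [2, 0, 3, 1]
balanced_prompts = reorder(prompts, perm)

# Pairing by position now attaches values to the wrong samples.
misaligned = list(zip(balanced_prompts, reward_extra_infos["acc"]))

# Applying the same permutation to every extra-info list restores the pairing.
aligned_infos = {key: reorder(values, perm) for key, values in reward_extra_infos.items()}
aligned = list(zip(balanced_prompts, aligned_infos["acc"]))
```

If the mismatch does reproduce, one option would be to apply the balancing permutation to the extra-info lists as well; another would be to store the extra info alongside the batch so that both are reordered together.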
bug