Support Training with Both Function-Based Reward and DPO Reward Simultaneously

Open • lianghsun opened this issue 1 year ago • 0 comments

Hello,

I would like to confirm whether the current implementation supports training with both a function-based reward and a DPO reward at the same time. If not, are there any planned updates or workarounds to achieve this?

Thank you!

lianghsun • Feb 06 '25 09:02