Does it support Preference data (for training Reward / DPO)?

Open airlsyn opened this issue 1 year ago • 4 comments

🚀 Feature Request

The preference data looks like this:

{
    "chosen":
    [
        {"role": "user", "content": "abcd"},
        {"role": "assistant", "content": "abcef"},
        ...
    ],
    "rejected":
    [
        {"role": "user", "content": "abcd"},
        {"role": "assistant", "content": "abcef"},
        ...
    ]
}

This data is used to train a reward model or for DPO.

I'm wondering whether streaming can be used for this kind of data and, if so, how. Thanks very much.

airlsyn avatar Apr 17 '24 08:04 airlsyn

Yes, it should work out of the box. Use MDSWriter to convert your preference data to MDS, then create a streaming dataset from it. Use json as the column encoding. Let me know if you run into any issues.
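
For anyone landing here later, a minimal sketch of that suggestion might look like the following. It assumes the mosaicml-streaming package is installed and that the writer class meant above is `MDSWriter`. The helper names (`to_sample`, `write_preference_mds`, `read_preference_mds`) and the output directory are made up for illustration and are not part of the library.

```python
from typing import Any, Dict, Iterable, List

Message = Dict[str, str]

# Column name -> MDS encoding. With 'json', each conversation (a list of
# role/content dicts) is stored as a single JSON-encoded field.
COLUMNS = {"chosen": "json", "rejected": "json"}


def to_sample(record: Dict[str, Any]) -> Dict[str, List[Message]]:
    """Validate one preference record and keep only the two conversation columns."""
    sample: Dict[str, List[Message]] = {}
    for key in COLUMNS:
        messages = record[key]
        for message in messages:
            if set(message) != {"role", "content"}:
                raise ValueError(f"unexpected message keys in {key!r}: {sorted(message)}")
        sample[key] = list(messages)
    return sample


def write_preference_mds(records: Iterable[Dict[str, Any]], out_dir: str) -> None:
    """Write preference records to MDS shards under out_dir (hypothetical helper)."""
    # Imported lazily so to_sample() stays usable without the library installed.
    from streaming import MDSWriter

    # MDSWriter expects out_dir to be empty or not to exist yet.
    with MDSWriter(out=out_dir, columns=COLUMNS) as writer:
        for record in records:
            writer.write(to_sample(record))


def read_preference_mds(local_dir: str):
    """Open the shards written above as a StreamingDataset (hypothetical helper)."""
    from streaming import StreamingDataset

    # Local-only dataset; pass remote=... instead when shards live in object storage.
    return StreamingDataset(local=local_dir, shuffle=False, batch_size=1)
```

After `write_preference_mds(records, "./preference_mds")`, indexing the dataset returned by `read_preference_mds("./preference_mds")` should give back dicts with `chosen` and `rejected` lists, which a reward-model or DPO collate function can then tokenize.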

XiaohanZhangCMU avatar Apr 17 '24 14:04 XiaohanZhangCMU

Thank you for the quick explanation. I'll try it.

airlsyn avatar Apr 18 '24 01:04 airlsyn

@ericxsun, have you tried @XiaohanZhangCMU's suggestion? Did it work?

karan6181 avatar Jun 20 '24 15:06 karan6181

Hi @ericxsun, I want to follow up here before closing this issue.

XiaohanZhangCMU avatar Sep 05 '24 22:09 XiaohanZhangCMU