yechenzhi
My reproduced results are: recall@50: 0.00469, ndcg@50: 0.00823, hitrate@50: 0.01147, epoch time: 12.55 s.
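For reference, a minimal sketch of how these top-K metrics are commonly computed; the function name, variable names, and the per-user averaging are illustrative assumptions, not code from the PaddleRec recipe:

```python
import numpy as np

def topk_metrics(scores, relevant, k=50):
    """Recall@K, NDCG@K and hit-rate@K for one user.
    `scores` ranks all items; `relevant` is the set of ground-truth item ids."""
    topk = np.argsort(-scores)[:k]                       # ids of the K highest-scored items
    hit_ranks = [r for r, item in enumerate(topk) if item in relevant]
    recall = len(hit_ranks) / len(relevant)              # fraction of relevant items retrieved
    hitrate = float(len(hit_ranks) > 0)                  # did at least one relevant item appear
    dcg = sum(1.0 / np.log2(r + 2) for r in hit_ranks)   # binary-gain DCG over the hit positions
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(k, len(relevant))))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return recall, ndcg, hitrate

# Averaging these per-user values over the test set gives the reported
# recall@50 / ndcg@50 / hitrate@50 numbers.
```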
Just deleting the random seed fixes it; hoping to get a PR in for this: https://github.com/PaddlePaddle/PaddleRec/pull/789
> Main question is around testing: are you able to kick off a proper distributed run and see decreasing losses here? Specifically one concern I have is around the interaction...
> Out of curiosity, are you able to increase the per-device batch size beyond 4 on your RTX 4090s?

Yes, I set the per-device batch size to 6 now.

> And...
> I may be missing something obvious here though, if so please let me know.

I know you mentioned filtering out longer examples from the training set, which I did...
I'll rerun my recipes to see if there are any issues with them.
> I'll rerun my recipes to see if there are any issues with them.

Here is the code I have written for visualizing loss curves:

```
import numpy as np...
```
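Since the snippet above is cut off in the feed, here is a minimal sketch of the kind of script it describes; the log-file names, the line format matched by the regex, and the smoothing window are assumptions on my part, not the original code:

```python
import re
import numpy as np
import matplotlib.pyplot as plt

# Assumed log format: lines such as "step 120 | loss 1.2345";
# adjust the regex to whatever the training run actually prints.
LOSS_RE = re.compile(r"loss\s+([0-9.]+)")

def load_losses(path):
    losses = []
    with open(path) as f:
        for line in f:
            m = LOSS_RE.search(line)
            if m:
                losses.append(float(m.group(1)))
    return np.array(losses)

# Hypothetical log files for the two runs being compared.
for label, path in [("trl", "trl_train.log"), ("torchtune", "torchtune_train.log")]:
    losses = load_losses(path)
    # Light moving-average smoothing so the two curves are easier to compare.
    smoothed = np.convolve(losses, np.ones(20) / 20, mode="valid")
    plt.plot(smoothed, label=label)

plt.xlabel("step")
plt.ylabel("training loss")
plt.legend()
plt.savefig("loss_curves.png")
```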
> I know you mentioned some data filtering up front in [this comment](https://github.com/pytorch/torchtune/pull/645#issuecomment-2047915895), are you doing that here as well?

Yes, I applied data filtering in both TRL and Torchtune....
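For context, a rough sketch of the kind of length filtering I mean; the tokenizer, dataset, column names, and the 1024-token cutoff are illustrative placeholders, not the exact settings used in these runs:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")                       # placeholder tokenizer
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")           # placeholder preference dataset

MAX_LEN = 1024  # illustrative cutoff; should match the recipe's max sequence length

def short_enough(example):
    # Keep a pair only if prompt + the longer completion fits in the token budget.
    prompt_len = len(tokenizer(example["question"])["input_ids"])
    chosen_len = len(tokenizer(example["chosen"])["input_ids"])
    rejected_len = len(tokenizer(example["rejected"])["input_ids"])
    return prompt_len + max(chosen_len, rejected_len) <= MAX_LEN

filtered = dataset.filter(short_enough)
print(f"kept {len(filtered)} of {len(dataset)} examples")
```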
I am currently exploring various alignment techniques for tuning LLMs using [TRL](https://github.com/huggingface/trl). I've noticed that Torchtune appears to run more quickly, although I haven't conducted a comprehensive comparison of the...
@kartikayk Hello, I wrote a brief [RFC](https://github.com/pytorch/torchtune/issues/623#issue-2217054688) that includes preliminary pseudo-code, which is more of a conceptual framework at this stage. Please let me know if further pseudo-code is required.