YANG XINYU

Results: 1 issue by YANG XINYU

I reimplemented DPO on my own dataset and model, and the model tends to generate answers of increasing length: ![image](https://github.com/eric-mitchell/direct-preference-optimization/assets/46637504/a3be8de3-ff4f-479d-9f94-e14fa143bd72) Eventually the model generates long answers with repeated tokens. The same...
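
For context, a minimal sketch of the standard DPO objective in PyTorch (not the issue author's reimplementation; per-sequence log-probabilities and the `beta` value are assumed inputs):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss from summed per-sequence log-probs.

    All log-prob arguments are shape (batch,); beta is the KL-penalty
    coefficient (0.1 here is an assumed example value, not the repo default).
    """
    # Implicit reward = beta * (policy log-prob - reference log-prob)
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Negative log-sigmoid of the reward margin between chosen and rejected
    loss = -F.logsigmoid(chosen_rewards - rejected_rewards)
    return loss.mean()
```

Because the loss depends only on summed sequence log-probabilities, longer responses can shift the reward margin, which is one commonly discussed explanation for length growth during DPO training.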