panml

Idea: look into DPO for model tuning

Open: wanoz opened this issue 2 years ago • 1 comment

May 31 '23 22:05 wanoz

The idea is to look into Direct Preference Optimisation (DPO) as a method for model tuning:

https://arxiv.org/abs/2305.18290
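
As a rough illustration of what the objective in that paper computes, here is a minimal sketch of the per-example DPO loss in plain Python. It assumes the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses are already available from both the model being tuned and a frozen reference model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are hypothetical choices for illustration and are not part of panml.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen margin - rejected margin))).

    All four inputs are log-probabilities of a full response given the prompt.
    The names and the default beta are illustrative assumptions.
    """
    # How much more (in log space) the policy favours each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(x)) == log(1 + exp(-x)); branch to avoid overflow for large |x|.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Policy identical to the reference: loss equals log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)

# Policy shifts probability toward the chosen response: loss drops below the baseline.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In a real tuning loop this would be computed on batched tensors and averaged, with gradients flowing only through the policy's log-probabilities.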
