[dkpd] what's the motivation?

Open lllyyyqqq opened this issue 1 year ago • 0 comments

I've read the dkpd paper, and the experimental results show that dkpd works, but I don't really see why DPO is applied to KD in the first place, or how it is supposed to improve on the traditional KLD or reverse KLD methods. Can you explain that to me?

Dec 31 '24 02:12 lllyyyqqq