Gryff1ndor

Thanks a lot! There is another problem bothering me: when using the DPO loss in my work, I found that the sigmoid function in the DPO loss caused gradient explosion, because...
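For readers following along, here is a minimal sketch of the sigmoid term being discussed, assuming the standard DPO formulation `-log(sigmoid(beta * (policy log-ratio - reference log-ratio)))`. The function names, argument names, and the `beta=0.1` default are illustrative assumptions, not taken from the commenter's code, and since the comment's explanation is cut off this is not a diagnosis of their issue. It only shows one common way to evaluate the log-sigmoid without overflowing for large-magnitude inputs.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x)).

    Branching on the sign of x keeps the argument of exp() non-positive,
    so exp() can underflow to 0.0 but never overflow.
    """
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # hypothetical default, chosen only for illustration
) -> float:
    """Per-example DPO loss under the assumed standard formulation."""
    policy_log_ratio = policy_chosen_logp - policy_rejected_logp
    ref_log_ratio = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_log_ratio - ref_log_ratio)
    return -log_sigmoid(logits)
```

A naive `math.log(1 / (1 + math.exp(-x)))` raises `OverflowError` for a strongly negative `x` such as `-1000.0`, whereas `log_sigmoid(-1000.0)` above returns a finite value. In PyTorch, `torch.nn.functional.logsigmoid` plays the same role.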