Liger-Kernel
Add TVD (Total variation distance) Kernel
🚀 The feature, motivation and pitch
TVD is a good distance metric (ref), and a kernel for it is easy to implement. Its gradient should also be more stable than that of KL divergence or JS divergence.
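For reference, here is a minimal plain-Python sketch of the metric and why its gradient is bounded. It is not the proposed Triton kernel or any existing Liger-Kernel API. The function names `tvd`, `tvd_grad_q`, and `kl_grad_q` are hypothetical, and the gradients are taken elementwise with respect to `q`, ignoring the softmax that would normally produce it.

```python
def tvd(p, q):
    """Total variation distance: 0.5 * sum_i |p_i - q_i|."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def tvd_grad_q(p, q):
    """Elementwise (sub)gradient of TVD w.r.t. q: 0.5 * sign(q_i - p_i)."""
    def sign(x):
        return (x > 0) - (x < 0)
    return [0.5 * sign(qi - pi) for pi, qi in zip(p, q)]


def kl_grad_q(p, q):
    """Elementwise gradient of KL(p || q) = sum_i p_i * log(p_i / q_i) w.r.t. q."""
    return [-pi / qi for pi, qi in zip(p, q)]


# Illustrative distributions (made up for this sketch).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
print(tvd(p, q))         # distance between p and q
print(tvd_grad_q(p, q))  # every entry lies in {-0.5, 0.0, 0.5}

# When q puts almost no mass where p does, the KL gradient blows up,
# while the TVD gradient stays bounded by 0.5 in magnitude.
p2 = [0.5, 0.5]
q2 = [1e-6, 1.0 - 1e-6]
print(kl_grad_q(p2, q2))
print(tvd_grad_q(p2, q2))
```

The TVD gradient never exceeds 0.5 in magnitude, whereas the KL gradient scales as `p_i / q_i` and grows without bound as `q_i` approaches zero.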
Alternatives
No response
Additional context
No response
I'll look into it over the week if no one else takes it.
#take @ByronHsu @qingquansong , I’d like to make an attempt. Could you please assign it to me?
Assigned to you. Thanks!