linfa

Support for Hamming distance (l0 norm)

Open jorgehermo9 opened this issue 3 years ago • 2 comments

Currently, the Lp distance is broken for p=0: it tries to calculate 1/0 (the exponent 1/p). I have implemented the Hamming distance (l0 norm), which counts the number of positions at which two points have different values.
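To make the problem concrete, here is a minimal sketch in plain Rust over slices. It is not the code from this PR and not linfa's API: `hamming_distance` and `lp_distance` are hypothetical helper names, and `lp_distance` uses the textbook formula `(sum |a_i - b_i|^p)^(1/p)` as an assumption about how the p=0 case goes wrong.

```rust
/// Hypothetical helper (not linfa's API): counts the positions at which
/// two equal-length slices hold different values.
fn hamming_distance(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "points must have the same dimension");
    a.iter().zip(b.iter()).filter(|(x, y)| x != y).count() as f64
}

/// Textbook Lp distance, (sum |a_i - b_i|^p)^(1/p). Hypothetical helper.
/// With p = 0 the exponent 1/p is 1/0, which is +inf in floating point,
/// so the result no longer describes how far apart the points are.
fn lp_distance(a: &[f64], b: &[f64], p: f64) -> f64 {
    assert_eq!(a.len(), b.len(), "points must have the same dimension");
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).abs().powf(p))
        .sum::<f64>()
        .powf(1.0 / p)
}
```

For `[1.0, 0.0, 3.0]` and `[1.0, 2.0, 3.0]`, `hamming_distance` counts one differing position, while `lp_distance` with `p = 0.0` raises the sum to an infinite power and returns infinity.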

I also changed the distance tests to improve code readability and to extend the norm symmetry/homogeneity checks.

jorgehermo9, Aug 01 '22 08:08

Codecov Report

Merging #233 (f3f9a3b) into master (870107f) will increase coverage by 0.08%. The diff coverage is 72.50%.

@@            Coverage Diff             @@
##           master     #233      +/-   ##
==========================================
+ Coverage   55.37%   55.45%   +0.08%     
==========================================
  Files          96       96              
  Lines        8918     8955      +37     
==========================================
+ Hits         4938     4966      +28     
- Misses       3980     3989       +9     
| Impacted Files | Coverage Δ | |
|---|---|---|
| algorithms/linfa-nn/src/distance.rs | 48.38% <72.50%> (+11.72%) | :arrow_up: |
| algorithms/linfa-nn/src/lib.rs | 50.00% <0.00%> (-7.15%) | :arrow_down: |
| algorithms/linfa-nn/src/linear.rs | 43.75% <0.00%> (-1.71%) | :arrow_down: |
| ...lgorithms/linfa-clustering/src/optics/algorithm.rs | 48.53% <0.00%> (-0.29%) | :arrow_down: |
| algorithms/linfa-nn/tests/nn.rs | 78.04% <0.00%> (ø) | |
| algorithms/linfa-kernel/src/lib.rs | 60.21% <0.00%> (ø) | |
| algorithms/linfa-logistic/src/lib.rs | 69.55% <0.00%> (ø) | |
| algorithms/linfa-pls/src/pls_generic.rs | 69.64% <0.00%> (ø) | |
| algorithms/linfa-svm/src/classification.rs | 60.11% <0.00%> (ø) | |
| src/dataset/mod.rs | 88.96% <0.00%> (+0.34%) | :arrow_up: |

... and 4 more


codecov-commenter, Aug 01 '22 08:08

IDK if it should be referred to as the L0 norm, since the Wiki article states that Hamming distance is not technically a "real" norm. I think it'd be better to rename it to something like Hamming and leave the p=0 case of the Lp distance broken, since we don't technically support the real "L0" norm.
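One way to see why it is not a "real" norm is that counting non-zero entries fails homogeneity: scaling a vector by k should scale a norm by |k|, but the count stays the same. The sketch below illustrates this in plain Rust; `l0_count`, `l1_norm`, and `is_homogeneous_for` are hypothetical names used for illustration, not part of linfa or this PR.

```rust
/// Hypothetical helper: number of non-zero entries (the so-called "L0 norm").
fn l0_count(v: &[f64]) -> f64 {
    v.iter().filter(|x| **x != 0.0).count() as f64
}

/// L1 norm, a true norm, included for comparison.
fn l1_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x.abs()).sum()
}

/// Checks whether norm(k * v) == |k| * norm(v) holds for this one input.
/// A single passing input proves nothing; a single failing one is a counterexample.
fn is_homogeneous_for(norm: fn(&[f64]) -> f64, v: &[f64], k: f64) -> bool {
    let scaled: Vec<f64> = v.iter().map(|x| k * x).collect();
    (norm(&scaled) - k.abs() * norm(v)).abs() < 1e-12
}
```

With `v = [1.0, 0.0, -3.0]` and `k = 2.0`, the L1 norm goes from 4 to 8 as homogeneity requires, whereas the non-zero count stays at 2 instead of becoming 4.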

YuhanLiin, Aug 07 '22 01:08