Fix function cos_similarity
If dimension of input is (1,N) or (N,1), cos_similarity was not calculated.
Could you add an axis argument? I think this is one of the functions that we translated from MATLAB and never changed to numpy style.
Default could be -1. I think no one uses col or row vectors, or at least no one should use col or row vectors, in numpy we use just vectors.
Codecov Report
Patch and project coverage have no change.
Comparison is base (
42e0c62) 65.44% compared to head (a2ef5af) 65.44%.
:exclamation: Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.
Additional details and impacted files
@@ Coverage Diff @@
## master #90 +/- ##
=======================================
Coverage 65.44% 65.44%
=======================================
Files 80 80
Lines 5394 5394
=======================================
Hits 3530 3530
Misses 1864 1864
| Impacted Files | Coverage Δ | |
|---|---|---|
| paderbox/math/vector.py | 69.23% <0.00%> (ø) |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
In the cos similarity function, the absolute value is used in the denominator. So for the case [1,0] [-1,0] the similarity would not be -1 but 1, which conflicts with the definition of cos similarity?
There are different definitions for the cosine similarity and the cosine distance. Depending, what you do, you use a different definition. When we wrote those two function, we discussed, which implementation we should use and for some reason, we decided to use the current split. (I forgot the arguments)
When you search for the definition of cosine similarity, you find the definition for real valued numbers without conj and without abs inside, so the old implementation of "cos_distance" was correct, according to this definition.
When you work with complex number, you have to fix the denominator, but the number would still be complex. As far as I know, the abs is usually used, when you work with complex numbers and cosine dist or similarity.
Since you changed the implementation to use abs, could you give a motivation/use case for abs?
@TCord What do you usually use?
While I am fine with a breaking change to remove support for column vectors (shouldn't cause any trouble), we should be careful with a breaking change that introduces an abs.
I would vote against making a breaking change, here. We should keep the behavior of the function (value range between 0 and 1, but no usage of an absolute value) identical, and instead add an axis argument and an assertion that the output will be valid.