paderbox Fix function cos

If dimension of input is (1,N) or (N,1), cos_similarity was not calculated.

Jul 20 '23 13:07 RautenbergFrederik

Could you add an axis argument? I think this is one of the functions that we translated from MATLAB and never changed to numpy style.

Default could be -1. I think no one uses col or row vectors, or at least no one should use col or row vectors, in numpy we use just vectors.

Jul 20 '23 13:07 boeddeker

Codecov Report

Patch and project coverage have no change.

Comparison is base (42e0c62) 65.44% compared to head (a2ef5af) 65.44%.

:exclamation: Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files

@@           Coverage Diff           @@
##           master      #90   +/-   ##
=======================================
  Coverage   65.44%   65.44%           
=======================================
  Files          80       80           
  Lines        5394     5394           
=======================================
  Hits         3530     3530           
  Misses       1864     1864

Impacted Files	Coverage Δ
paderbox/math/vector.py	`69.23% <0.00%> (ø)`

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

Jul 20 '23 14:07 codecov-commenter

In the cos similarity function, the absolute value is used in the denominator. So for the case [1,0] [-1,0] the similarity would not be -1 but 1, which conflicts with the definition of cos similarity?

Jul 20 '23 14:07 RautenbergFrederik

There are different definitions for the cosine similarity and the cosine distance. Depending, what you do, you use a different definition. When we wrote those two function, we discussed, which implementation we should use and for some reason, we decided to use the current split. (I forgot the arguments)

When you search for the definition of cosine similarity, you find the definition for real valued numbers without conj and without abs inside, so the old implementation of "cos_distance" was correct, according to this definition.

When you work with complex number, you have to fix the denominator, but the number would still be complex. As far as I know, the abs is usually used, when you work with complex numbers and cosine dist or similarity.

Since you changed the implementation to use abs, could you give a motivation/use case for abs?

@TCord What do you usually use?

While I am fine with a breaking change to remove support for column vectors (shouldn't cause any trouble), we should be careful with a breaking change that introduces an abs.

Jul 20 '23 15:07 boeddeker

I would vote against making a breaking change, here. We should keep the behavior of the function (value range between 0 and 1, but no usage of an absolute value) identical, and instead add an axis argument and an assertion that the output will be valid.

Jul 21 '23 12:07 TCord

Fix function cos_similarity

Codecov Report