Is ItemItemRecommender inconsistent?

Open sharthZ23 opened this issue 5 years ago • 4 comments

Hi, thanks for this nice package!

While working with the library I came across the following inconsistent behavior of ItemItemRecommender:

  1. ItemItemRecommender can generate fewer than N recommendations (while exactly N are expected)
  2. ItemItemRecommender can add already liked items to the recommendations, even when the flag filter_already_liked_items is set to True

Such errors occur in all ItemItemRecommender variations: Cosine, TfIdf and BM25. It is also interesting that the final distributions of recommendation lengths differ between variations, as does the number of known interactions included in the recommendations.

A Jupyter notebook with test data to reproduce the problems: implicit_itemitem_bug_code.zip
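The two checks can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the helper name `check_recommendations` and the sample item IDs are made up for this example.

```python
def check_recommendations(recommended, known, n):
    """Check one user's recommendations for the two problems above.

    Returns (is_short, n_already_liked):
    - is_short: True if fewer than n items were recommended
    - n_already_liked: number of recommended items the user already liked
    """
    is_short = len(recommended) < n
    n_already_liked = len(set(recommended) & set(known))
    return is_short, n_already_liked


# Hypothetical data for a single user (not taken from the notebook).
recommended = [10, 11, 12]   # items returned by the recommender
known = [12, 40]             # items the user already interacted with
print(check_recommendations(recommended, known, n=5))
```

With these made-up inputs the user gets 3 items instead of 5, and one of them (item 12) is already known.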

sharthZ23 avatar Mar 30 '20 20:03 sharthZ23

Thanks for your bug report!

I'll check this soon

ita9naiwa avatar Mar 31 '20 03:03 ita9naiwa

First, point 1 is not a bug. The KNN in implicit is an approximate version. Let K be the number of neighbors stored and N the recommendation size.

Suppose that K <= N. For a user A who clicked only a single item B, only the top-K items most similar to B can have a similarity score higher than 0. Therefore, at most K items can be recommended.

Similarly, some items do not share any users. In that case, the similarities between these items are zero. Some users have only items with few interactions (which therefore may not share any interactions with other items).
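To illustrate the point, here is a small NumPy sketch. It is not implicit's actual implementation: `top_k_similarity`, `recommend`, and the toy similarity matrix are assumptions made up for this example. It only shows how keeping K neighbors per item caps the number of candidates for a user with a single liked item.

```python
import numpy as np


def top_k_similarity(full_sim, k):
    """Keep only the k largest similarities in each row, zero out the rest."""
    pruned = np.zeros_like(full_sim)
    for i, row in enumerate(full_sim):
        top = np.argsort(row)[::-1][:k]
        pruned[i, top] = row[top]
    return pruned


def recommend(user_items, sim, n):
    """Score items for one user and return up to n items with a positive score."""
    scores = user_items @ sim          # sum of similarities to liked items
    scores[user_items > 0] = 0         # drop items the user already liked
    candidates = np.flatnonzero(scores > 0)
    ranked = candidates[np.argsort(scores[candidates])[::-1]]
    return ranked[:n]


# Toy similarity matrix for 6 items: positive everywhere off the diagonal.
idx = np.arange(6)
full_sim = 1.0 / (1.0 + np.abs(np.subtract.outer(idx, idx)))
np.fill_diagonal(full_sim, 0.0)

# User A liked only item B = 0.
user_items = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

K, N = 2, 4
recs_pruned = recommend(user_items, top_k_similarity(full_sim, K), N)
recs_full = recommend(user_items, full_sim, N)
print(len(recs_pruned), len(recs_full))
```

With K = 2 stored neighbors only two items have a nonzero score, so the user gets 2 recommendations even though N = 4 were requested. With the full similarity matrix the same user gets all 4.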

ita9naiwa avatar Apr 23 '20 13:04 ita9naiwa

Secondly,

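import numpy as np

# Version from the notebook: count shared items with np.intersect1d
def intersection_mapper(row):
    row1 = row['item_id']
    row2 = row['item_id_known']
    return len(np.intersect1d(row1, row2))

# Alternative version: count shared items with Python sets
def intersection_mapper(row):
    row1 = set(row['item_id'])
    row2 = set(row['item_id_known'])
    return len(row1.intersection(row2))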

The second version of intersection_mapper works correctly (but I don't know why np.intersect1d does not work).

ita9naiwa avatar Apr 23 '20 14:04 ita9naiwa

On the first point, I realized that these are limitations of the algorithm itself. It would still be nice to issue a warning in this situation, but this is secondary.

For the second point, I didn't understand your position, because your variation of intersection_mapper works identically to mine and there is still an error with liked items in the recommendations. My version: [image] Your version: [image]

The final tables are identical and show that there are many already liked items in the recommendations.
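As a quick sanity check outside the notebook, here is a minimal sketch comparing the two mappers on a made-up row. The function names `count_overlap_numpy` and `count_overlap_set` and the sample IDs are hypothetical, and a plain dict stands in for a DataFrame row.

```python
import numpy as np


def count_overlap_numpy(row):
    """Count shared items via np.intersect1d (returns sorted unique common values)."""
    return len(np.intersect1d(row["item_id"], row["item_id_known"]))


def count_overlap_set(row):
    """Count shared items via Python set intersection."""
    return len(set(row["item_id"]) & set(row["item_id_known"]))


# Made-up row: recommended items (with a duplicate) and already known items.
row = {"item_id": [3, 1, 2, 2], "item_id_known": [2, 5, 3]}

numpy_count = count_overlap_numpy(row)
set_count = count_overlap_set(row)
print(numpy_count, set_count)
```

Both approaches deduplicate their inputs, so on plain lists of integer IDs they report the same overlap (2 in this example).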

sharthZ23 avatar Apr 23 '20 14:04 sharthZ23