TP-based metrics fail to compute correct values with duplicates in interactions

Open blondered opened this issue 1 year ago • 1 comments

What happened?

Precision (and other Classification and Ranking metrics) is computed incorrectly when the test interactions contain duplicates. See screenshot.

Expected behavior

I suppose the correct value should be 1/3, because 2/3 of the reco lists are False Positives.

Relevant logs and/or screenshots

Operating System

macOS Big Sur 11.6 (Apple M1)

Python Version

any

RecTools version

0.6.0

Jul 17 '24 11:07 blondered

Hi. I just ran into the same issue. Any updates/thoughts on this?

The reason for this behavior is in the code linked here: https://github.com/MobileTeleSystems/RecTools/blob/46deae3c5cfc923f164302525a853f9a3f918503/rectools/metrics/base.py#L97-L102

There, interactions are merged with recos, so if the interactions contain duplicated rows, every duplicate counts as a TP, which inflates the metric. A first idea is to drop duplicates after the merge, so that the same interaction is not counted as a TP several times.
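
To make the effect concrete, here is a minimal pandas sketch of the idea. It is not the RecTools code itself: the data, the column names (`user_id`, `item_id`, `rank`), the left merge, and the way TP is counted are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: one user, three recommended items, k = 3.
reco = pd.DataFrame(
    {
        "user_id": [1, 1, 1],
        "item_id": [10, 20, 30],
        "rank": [1, 2, 3],
    }
)

# Hypothetical test interactions with a duplicated (user, item) pair.
interactions = pd.DataFrame(
    {
        "user_id": [1, 1],
        "item_id": [10, 10],
    }
)

k = 3

# Assumed merge step: every interaction row is matched against the recos,
# so both duplicate rows pick up the same recommended item.
merged = interactions.merge(reco, on=["user_id", "item_id"], how="left")

# Counting matched rows as TP counts the duplicated interaction twice.
naive_tp = int((merged["rank"] <= k).sum())
naive_precision = naive_tp / k

# Proposed fix: drop duplicated (user, item) pairs after the merge.
deduped = merged.drop_duplicates(subset=["user_id", "item_id"])
fixed_tp = int((deduped["rank"] <= k).sum())
fixed_precision = fixed_tp / k

print(naive_precision, fixed_precision)
```

On this toy data the naive count gives 2 TP (precision 2/3), while the deduplicated count gives 1 TP (precision 1/3). Dropping duplicates from the interactions before the merge should give the same result and keeps the merged frame smaller, but whether either variant fits the actual code path would need to be checked against the library.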

Jul 29 '25 14:07 ddbelkov