
implicit BPR-MF wrapper model

Open blondered opened this issue 1 year ago • 3 comments

Feature Description

Create a wrapper for the popular BPR-MF baseline.

Why this feature?

It's easy to implement and is often used as a baseline in research.
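For readers unfamiliar with BPR-MF, here is a minimal pure-Python sketch of the pairwise SGD update it is built on. This is not the implicit or RecTools code; the function names (`train_bpr_mf`, `score`), the hyperparameters, and the uniform negative sampling are all assumptions made for illustration.

```python
import math
import random


def train_bpr_mf(interactions, n_users, n_items, factors=8,
                 lr=0.05, reg=0.01, epochs=300, seed=0):
    """Toy BPR-MF: SGD over (user, positive item, sampled negative item)."""
    rng = random.Random(seed)
    user_f = [[rng.gauss(0, 0.1) for _ in range(factors)] for _ in range(n_users)]
    item_f = [[rng.gauss(0, 0.1) for _ in range(factors)] for _ in range(n_items)]

    # Items each user has interacted with, used for negative sampling.
    seen = {}
    for u, i in interactions:
        seen.setdefault(u, set()).add(i)

    for _ in range(epochs):
        for u, i in interactions:
            if len(seen[u]) >= n_items:
                continue  # no negative item exists for this user
            # Uniformly sample an item the user has not interacted with.
            j = rng.randrange(n_items)
            while j in seen[u]:
                j = rng.randrange(n_items)

            pu, qi, qj = user_f[u], item_f[i], item_f[j]
            # x_uij = score(u, i) - score(u, j), clamped to avoid overflow.
            x_uij = sum(a * (b - c) for a, b, c in zip(pu, qi, qj))
            x_uij = max(-30.0, min(30.0, x_uij))
            # Derivative of ln(sigmoid(x)) with respect to x is sigmoid(-x).
            g = 1.0 / (1.0 + math.exp(x_uij))

            for f in range(factors):
                pu_f, qi_f, qj_f = pu[f], qi[f], qj[f]
                pu[f] += lr * (g * (qi_f - qj_f) - reg * pu_f)
                qi[f] += lr * (g * pu_f - reg * qi_f)
                qj[f] += lr * (-g * pu_f - reg * qj_f)
    return user_f, item_f


def score(user_f, item_f, u, i):
    """Dot product of user and item factors."""
    return sum(a * b for a, b in zip(user_f[u], item_f[i]))
```

A wrapper would delegate this training loop to an existing optimised implementation and only map the framework's dataset and recommendation interfaces onto it.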

Additional context

No response

blondered avatar May 14 '24 09:05 blondered

Created an experimental PR. https://github.com/MobileTeleSystems/RecTools/pull/232

I know there are several discussion points, but I hope it clarifies the desired specs.

Currently, the PR excludes training with features.

I think supporting fit with features is easier on GPU than on CPU, because the preparation before implicit.gpu.bpr_update() is written in Python https://github.com/benfred/implicit/blob/b33b809cb585cb8a65ad39d0f97497d37e98acaa/implicit/gpu/bpr.py#L137 while implicit.cpu.bpr is implemented in Cython. https://github.com/benfred/implicit/blob/b33b809cb585cb8a65ad39d0f97497d37e98acaa/implicit/cpu/bpr.pyx#L137-L187

chezou avatar Dec 12 '24 01:12 chezou

@chezou thank you so much for your contributions! I've merged the PR with the BPR model.

I wouldn't focus on adding features for BPR, since it's a complicated task and would take a lot of time. BPR itself is one of the best-known baselines, but since we already have an MF algorithm with features, we don't necessarily need another one.

Right now in RecTools we are focusing on adding features and providing maximum customisation for transformer models (SASRec and BERT4Rec). They can be found here: https://github.com/MobileTeleSystems/RecTools/blob/experimental/sasrec/examples/tutorials/transformers_tutorial.ipynb We are preparing them for release.

Another big story right now is CandidateRankingModel, which uses baseline models to generate candidates and then uses Gradient Boosting to rerank them. https://github.com/MobileTeleSystems/RecTools/blob/experimental/two_stage/examples/tutorials/candidate_ranking_model_tutorial.ipynb

As for the baselines, we still need a SLIM model. It is well known for both quality and efficiency, and I do have a feeling that we are missing it in the framework right now. We also have an issue for it: https://github.com/MobileTeleSystems/RecTools/issues/103 We have found SLIM to be quite useful in production for fast MVPs and candidate generation, but feature selection (as in the paper) is a must; otherwise training is too slow. Inference for SLIM could be done with the ItemKNN inference code from the implicit framework, which is already optimised: SLIM has a sparse item-item similarity matrix under the hood, so the inference logic is the same.
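To illustrate why the inference logic is shared, here is a minimal sketch of scoring with a sparse item-item matrix. It is not implicit's ItemKNN code; the function name `recommend_item_item` and the dict-of-dicts layout are assumptions for illustration. The same routine applies whether the weights come from a KNN similarity or from SLIM's learned coefficients.

```python
from collections import defaultdict


def recommend_item_item(user_items, item_sim, k=10, filter_seen=True):
    """Score items for one user from a sparse item-item weight matrix.

    user_items: {item_id: interaction weight} for a single user.
    item_sim:   {item_id: {other_item_id: weight}}, storing only non-zeros.
    Returns up to k (item_id, score) pairs, best first.
    """
    scores = defaultdict(float)
    # Each item in the user's history votes for the items it is linked to.
    for i, weight in user_items.items():
        for j, sim in item_sim.get(i, {}).items():
            scores[j] += weight * sim

    if filter_seen:
        for i in user_items:
            scores.pop(i, None)

    # Sort by score descending, breaking ties by item id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Toy weights: could equally be KNN similarities or SLIM coefficients.
item_sim = {
    0: {1: 0.5, 2: 0.2},
    1: {0: 0.5, 3: 0.4},
    2: {0: 0.2},
    3: {1: 0.4},
}
recs = recommend_item_item({0: 1.0, 1: 1.0}, item_sim, k=2)
```

Only the way the weights are learned differs between the two models, so a SLIM implementation would mainly need its own fit step.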

blondered avatar Jan 13 '25 09:01 blondered

Thanks for the review and comment. I don't have any preference for having the feature for BPR, so feel free to close this issue.

I briefly researched SLIM (specifically, SLIM ElasticNet) and found that many frameworks use this CPU-based implementation: https://github.com/MaurizioFD/RecSys2019_DeepLearning_Evaluation/blob/master/SLIM_ElasticNet/SLIMElasticNetRecommender.py, while some use a PyTorch-based one.

Anyway, let's continue the discussion in #103.

chezou avatar Jan 13 '25 15:01 chezou