implicit BPR-MF wrapper model
Feature Description
Create a wrapper for popular BPR-MF baseline
Why this feature?
It's easy to implement and it's often used as a baseline in research
Additional context
No response
Created an experimental PR. https://github.com/MobileTeleSystems/RecTools/pull/232
I know there are several discussion points, but I hope it clarifies the desired specs.
Currently, the PR excludes training with features.
I think supporting fit with features would be easier on GPU than on CPU, because the preparation before implicit.gpu.bpr_update() is written in Python (https://github.com/benfred/implicit/blob/b33b809cb585cb8a65ad39d0f97497d37e98acaa/implicit/gpu/bpr.py#L137), while implicit.cpu.bpr is implemented in Cython (https://github.com/benfred/implicit/blob/b33b809cb585cb8a65ad39d0f97497d37e98acaa/implicit/cpu/bpr.pyx#L137-L187).
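For context on what those update routines actually compute, here is a minimal numpy sketch of a single BPR SGD step for one (user, positive item, negative item) triple. This is a hypothetical standalone illustration of the BPR objective, not implicit's actual code, which performs these updates in batched Cython/CUDA kernels; the function name and parameters are my own.

```python
import numpy as np

def bpr_update(user_f, item_f, u, i, j, lr=0.05, reg=0.01):
    """One SGD ascent step on the BPR objective ln(sigmoid(x_uij)) - reg*||params||^2
    for user u, positive item i, negative item j (hypothetical sketch)."""
    x_uij = user_f[u] @ (item_f[i] - item_f[j])   # preference score difference
    sig = 1.0 / (1.0 + np.exp(x_uij))             # sigmoid(-x_uij) = d ln(sigmoid(x)) / dx
    grad_u = sig * (item_f[i] - item_f[j]) - reg * user_f[u]
    grad_i = sig * user_f[u] - reg * item_f[i]
    grad_j = -sig * user_f[u] - reg * item_f[j]
    user_f[u] += lr * grad_u
    item_f[i] += lr * grad_i
    item_f[j] += lr * grad_j
```

Repeated updates push the positive item's score above the negative item's for that user; a full trainer would sample triples from the interaction matrix each epoch.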
@chezou thank you so much for your contributions! I've merged the PR with the BPR model.
I wouldn't focus on adding feature support for BPR, since it's a complicated task and will take a lot of time. BPR itself is one of the best-known baselines, but since we already have an MF algorithm with features, we don't necessarily need another one.
Right now in RecTools we are focusing on adding features and providing maximum customisation to transformer models (SASRec and BERT4Rec). They can be found here: https://github.com/MobileTeleSystems/RecTools/blob/experimental/sasrec/examples/tutorials/transformers_tutorial.ipynb, and we are preparing them for release.
Another big story right now is CandidateRankingModel, which uses baseline models to generate candidates and then Gradient Boosting to rerank them: https://github.com/MobileTeleSystems/RecTools/blob/experimental/two_stage/examples/tutorials/candidate_ranking_model_tutorial.ipynb
As for the baselines, we still need a SLIM model. It is well known for both quality and efficiency, and I do have a feeling that we are missing it in the framework right now. We also have an issue for it: https://github.com/MobileTeleSystems/RecTools/issues/103
We have found SLIM to be quite useful in production for fast MVPs and candidate generation, but feature selection (as in the paper) is a must, otherwise training is too slow.
Inference for SLIM could be done with the implicit framework's ItemKNN inference code, which is already optimised. SLIM has a sparse item-item similarity matrix under the hood, so the inference logic is the same.
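To make that concrete, here is a minimal scipy sketch of the shared inference pattern: score items by multiplying a user's interaction row with a sparse item-item similarity matrix, then mask already-seen items. The function name and signature are hypothetical; implicit's ItemKNN code does the equivalent in optimised form.

```python
import numpy as np
from scipy.sparse import csr_matrix

def slim_recommend(user_items, similarity, user_id, n=10):
    """Top-n items for one user from a sparse item-item similarity matrix
    (hypothetical sketch of ItemKNN/SLIM-style inference)."""
    scores = user_items[user_id] @ similarity        # 1 x n_items sparse row
    scores = np.asarray(scores.todense()).ravel()
    seen = user_items[user_id].indices               # filter already-seen items
    scores[seen] = -np.inf
    return np.argsort(-scores)[:n]
```

The same function serves both models because only the origin of the similarity matrix differs: neighbourhood statistics for ItemKNN, learned sparse regression weights for SLIM.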
Thanks for the review and comment. I don't have any preference for having the feature for BPR, so feel free to close this issue.
I briefly researched SLIM (specifically, SLIM ElasticNet) and found that many frameworks use this CPU-based implementation: https://github.com/MaurizioFD/RecSys2019_DeepLearning_Evaluation/blob/master/SLIM_ElasticNet/SLIMElasticNetRecommender.py, while some use a PyTorch-based one.
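For reference, the CPU-based approach those implementations share is a per-item sparse regression: each item's interaction column is regressed on all other columns with an ElasticNet penalty and non-negative weights. Below is a minimal sketch of that idea, assuming scikit-learn; the function name and hyperparameter values are illustrative, not taken from any of the linked implementations.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import ElasticNet

def fit_slim_elasticnet(urm, l1_ratio=0.1, alpha=1e-4):
    """Learn a SLIM item-item weight matrix W (items x items) from a
    users x items interaction matrix (hypothetical minimal sketch)."""
    urm = csr_matrix(urm).tocsc()
    n_items = urm.shape[1]
    W = np.zeros((n_items, n_items))
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, positive=True,
                       fit_intercept=False, max_iter=100)
    for j in range(n_items):
        y = urm[:, j].toarray().ravel()
        # zero out the target column so item j cannot explain itself
        start, end = urm.indptr[j], urm.indptr[j + 1]
        backup = urm.data[start:end].copy()
        urm.data[start:end] = 0.0
        model.fit(urm, y)
        W[:, j] = model.coef_
        urm.data[start:end] = backup
    return W
```

The per-item loop is embarrassingly parallel, which is why real implementations add multiprocessing and, as noted above, feature selection to keep each regression small.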
Anyway, let's continue the discussion in #103.