[ASK] API of NextBasketRecommender
Hi, first of all thanks so much for the work you put in this project! I'm currently trying to use it for a quick improvement of baseline recommendations for a client of ours. I have a couple of simple questions related to the usage of NextBasketRecommenders.
First, they currently don't seem to support the .recommend(user_id, k) API. This is because the Recommender class doesn't allow passing extra **kwds, which the subsequent calls to rank(...) and score(...) would need in order to receive the history_baskets parameter. Is there a reason for not supporting recommend()? I was expecting it to be the main API for predicting the next basket.
Alternatively, I guess one can use the score and rank methods directly. But from what I've seen in the next_basket_evaluation code, this seems to be very laborious:
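To illustrate what I mean by forwarding the extra keyword arguments, here is a minimal sketch. ToyNextBasketModel and its frequency-based scoring are hypothetical stand-ins and not Cornac's actual classes; only the pattern of passing **kwargs from recommend() through rank() to score() is the point.

```python
import numpy as np


class ToyNextBasketModel:
    """Hypothetical stand-in for a next-basket model (not a Cornac class)."""

    def __init__(self, n_items):
        self.n_items = n_items

    def score(self, user_idx, history_baskets, **kwargs):
        # Toy scoring: count how often each item index occurs in the history.
        scores = np.zeros(self.n_items)
        for basket in history_baskets:
            for item_idx in basket:
                scores[item_idx] += 1.0
        return scores

    def rank(self, user_idx, **kwargs):
        # Extra keyword arguments (e.g. history_baskets) are passed on to score().
        scores = self.score(user_idx, **kwargs)
        ranked = np.argsort(-scores, kind="stable")  # best item first
        return ranked, scores

    def recommend(self, user_idx, k=-1, **kwargs):
        # Accepting **kwargs here is what lets history_baskets reach rank()/score().
        ranked, _ = self.rank(user_idx, **kwargs)
        return ranked[:k] if k > 0 else ranked


model = ToyNextBasketModel(n_items=5)
top = model.recommend(0, k=2, history_baskets=[[0, 1], [1, 3], [1, 3, 4]])
print(top)  # item 1 occurs three times, item 3 twice
```

With something like this, recommend() could stay the single entry point while each model decides which extra arguments it needs.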
item_rank, item_scores = model.rank(
    user_idx,
    item_indices,
    history_baskets=history_baskets,
    history_bids=bids[:-1],
    uir_tuple=test_set.uir_tuple,
    baskets=test_set.baskets,
    basket_indices=test_set.basket_indices,
    extra_data=test_set.extra_data,
)
On this note, even the score() method requires manually passing the history_baskets parameter, and this is not documented. It's supposed to be a list of lists, but of raw product IDs or of internal index values? For only the requested user or for all users? And in general, if the model was already trained on a dataset of baskets, why do I need to pass in something that's already in that dataset (or, alternatively, a separate test dataset)? Is there a simple way of extracting the required history_baskets from a dataset?
Maybe a simple example of a model fit -> predict/recommend scenario would be useful in the docs, in addition to the more complete Evaluation example.
For example, would the following be correct to create recommendations from the training data for a specific user?
from typing import Iterator


def user_baskets(ds, user_id) -> Iterator[list]:
    """Yields the baskets (each a list of item indices) of the user with the given ID from dataset ds."""
    user_idx = ds.uid_map[user_id]  # raw user ID -> internal user index
    _, item_indices, _ = ds.uir_tuple
    basket_indices = ds.user_basket_data[user_idx]
    for bidx in basket_indices:
        basket = ds.baskets[bidx]  # positions into the uir_tuple arrays
        item_ids = item_indices[basket]
        yield list(item_ids)
from cornac.data import BasketDataset
from cornac.models import TIFUKNN

ds = BasketDataset.build(data, fmt="UBITJson", seed=42)
model = TIFUKNN(
    n_neighbors=300,
    within_decay_rate=0.9,
    group_decay_rate=0.7,
    alpha=0.7,
    n_groups=7,
)
model.fit(ds)
user_id = "1104905"
baskets = list(user_baskets(ds, user_id))
scores = model.score(user_idx=None, history_baskets=baskets)
(The user_idx is never used by the score method)
I assume this would be followed by selecting the indices of the top k values in scores and translating those indices back to item IDs?
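For concreteness, the top-k step I have in mind would look roughly like the sketch below. It assumes that scores is a 1-D array with one entry per internal item index, and that the dataset exposes an iid_map (raw item ID -> index) analogous to the uid_map used above. The helper name top_k_item_ids and the toy data are made up for illustration.

```python
import numpy as np


def top_k_item_ids(scores, iid_map, k):
    """Return the raw IDs of the k highest-scoring items, best first."""
    scores = np.asarray(scores, dtype=float)
    k = min(k, scores.size)
    if k <= 0:
        return []
    # Invert the assumed raw ID -> index mapping to look up IDs by index.
    idx_to_id = {idx: raw_id for raw_id, idx in iid_map.items()}
    # argpartition selects the k best indices without sorting the full array...
    top = np.argpartition(-scores, k - 1)[:k]
    # ...so only those k candidates need to be ordered by descending score.
    top = top[np.argsort(-scores[top], kind="stable")]
    return [idx_to_id[int(i)] for i in top]


# Toy data standing in for model.score(...) output and ds.iid_map.
toy_scores = [0.1, 0.7, 0.3, 0.9]
toy_iid_map = {"apple": 0, "bread": 1, "milk": 2, "eggs": 3}
recommended = top_k_item_ids(toy_scores, toy_iid_map, k=2)
print(recommended)  # ['eggs', 'bread']
```

This sketch does not filter out items already in the user's history; whether that is desirable for next-basket prediction (where repeat purchases matter) is a separate choice.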
Thanks! And do you have plans to implement the recommend API for next basket models directly by any chance?
@buhrmann Definitely. We also welcome contributions. Please join us.