[ASK] API of NextBasketRecommender

Open buhrmann opened this issue 11 months ago • 4 comments

Hi, first of all thanks so much for the work you put into this project! I'm currently trying to use it for a quick improvement of baseline recommendations for a client of ours. I have a couple of simple questions related to the usage of NextBasketRecommenders.

Firstly, they currently don't seem to support the .recommend(user_id, k) API. This is because the Recommender class doesn't allow passing extra **kwds, which would be necessary for the subsequent calls to the rank(...) and score(...) methods to receive the history_baskets parameter. Is there any reason for not supporting the recommend() method? I was kind of expecting this to be the main API for predicting the next basket.
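To make the request concrete, here is a minimal sketch of the kind of keyword forwarding I mean. Both StubNextBasketModel and recommend_next_basket are hypothetical names made up for illustration, not cornac code; the stub only mimics a score() method that needs history_baskets, and its frequency-based scoring is a toy.

```python
class StubNextBasketModel:
    """Stand-in for a fitted next-basket model (hypothetical, not cornac code)."""

    def __init__(self, n_items):
        self.n_items = n_items

    def score(self, user_idx, history_baskets, **kwargs):
        # Toy scoring: count how often each item index occurs in the history.
        scores = [0.0] * self.n_items
        for basket in history_baskets:
            for item_idx in basket:
                scores[item_idx] += 1.0
        return scores


def recommend_next_basket(model, user_idx, k=10, **kwargs):
    """Hypothetical helper: forward extra keyword arguments (for example
    history_baskets) to model.score and return the top-k item indices."""
    scores = model.score(user_idx, **kwargs)
    # Sort item indices by descending score; ties keep their index order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


stub = StubNextBasketModel(n_items=5)
top_items = recommend_next_basket(
    stub, user_idx=0, k=2, history_baskets=[[1, 3], [3, 4], [3, 1]]
)
print(top_items)
```

The point is only that recommend() would need to accept and pass through **kwargs; how that should fit into the real Recommender class is for the maintainers to decide.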

Alternatively, I guess one can use the score and rank methods directly. But from what I've seen in the next_basket_evaluation code, this seems to be very laborious:

item_rank, item_scores = model.rank(
    user_idx,
    item_indices,
    history_baskets=history_baskets,
    history_bids=bids[:-1],
    uir_tuple=test_set.uir_tuple,
    baskets=test_set.baskets,
    basket_indices=test_set.basket_indices,
    extra_data=test_set.extra_data,
)

On this note, even the score() method requires manually passing the history_baskets parameter, and this is not documented. It's supposed to be a list of lists, but of product ids or original index values? For only the requested user or for all users? And in general, if the model was trained already with a dataset of baskets, why do I need to pass in something that's already in the dataset (or alternatively a separate test dataset)? Is there a simple way of extracting the required history_baskets from a dataset?

Maybe a simple example of a model fit -> predict/recommend scenario would be useful in the docs, in addition to the more complete Evaluation example.

buhrmann avatar Mar 04 '25 14:03 buhrmann

For example, would the following be correct to create recommendations from the training data for a specific user?

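from collections.abc import Iterator

from cornac.data import BasketDataset
from cornac.models import TIFUKNN


def user_baskets(ds, user_id) -> Iterator[list]:
    """Yields each basket of the user with the given ID from dataset ds as a list of item indices."""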
    user_idx = ds.uid_map[user_id]
    _, item_indices, _ = ds.uir_tuple
    basket_indices = ds.user_basket_data[user_idx]
    for bidx in basket_indices:
        basket = ds.baskets[bidx]
        item_ids = item_indices[basket]
        yield list(item_ids)


ds = BasketDataset.build(data, fmt="UBITJson", seed=42)

model = TIFUKNN(
    n_neighbors=300,
    within_decay_rate=0.9,
    group_decay_rate=0.7,
    alpha=0.7,
    n_groups=7,
)

model.fit(ds)

user_id = "1104905"
baskets = list(user_baskets(ds, user_id))
scores = model.score(user_idx=None, history_baskets=baskets)

(The user_idx is never used by the score method)

This, followed by selecting the top k indices in scores and translating those indices back to item IDs I assume?
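A minimal sketch of that last step, assuming the dataset exposes an iid_map from raw item ID to item index (analogous to the uid_map used above; I haven't verified this against the docs). The helper top_k_item_ids and the toy data are made up for illustration.

```python
def top_k_item_ids(scores, iid_map, k=10):
    """Return the raw IDs of the k highest-scoring items, best first.

    scores:  sequence of scores, one per item index
    iid_map: mapping raw item ID -> item index (assumed layout)
    """
    # Invert the mapping so item indices can be translated back to raw IDs.
    idx_to_id = {idx: raw_id for raw_id, idx in iid_map.items()}
    # Sort item indices by descending score; ties keep their index order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [idx_to_id[i] for i in ranked[:k]]


# Toy data standing in for ds.iid_map and the output of model.score(...)
toy_iid_map = {"apple": 0, "bread": 1, "milk": 2, "eggs": 3}
toy_scores = [0.1, 0.7, 0.4, 0.9]
print(top_k_item_ids(toy_scores, toy_iid_map, k=2))  # ['eggs', 'bread']
```

With the real objects this would be called as top_k_item_ids(scores, ds.iid_map, k), provided the assumption about iid_map holds.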

buhrmann avatar Mar 04 '25 16:03 buhrmann

Hi @buhrmann,

Thank you for raising this problem. The solution you mentioned above is valid.

lthoang avatar Mar 05 '25 00:03 lthoang

Thanks! And do you have plans to implement the recommend API for next basket models directly by any chance?

buhrmann avatar Mar 05 '25 11:03 buhrmann

@buhrmann Definitely. We also welcome contributions. Please join us.

lthoang avatar Mar 05 '25 13:03 lthoang