Adding user features reduces model performance?

Open williamcao-01 opened this issue 7 years ago • 1 comments

Hi everyone,

I am trying to use the ml-1m data to build a recommender system model for users. What seems odd to me is that the model performs better without the user features. Did I do something wrong when adding the features, or is this normal?

Fitting the dataset:

    from lightfm.data import Dataset

    dataset = Dataset()
    dataset.fit(
        users=(row['UserID'] for index, row in users_df.iterrows()),
        items=(row['MovieID'] for index, row in movie_df.iterrows()),
        user_features=set(user_features_flat),
    )

Creating the interaction and feature matrices:

    # Each interaction is a (user id, item id, weight) triple.
    (interactions, weights) = dataset.build_interactions(
        (row['UserID'], row['MovieID'], row['rating'])
        for index, row in ratings_df.iterrows()
    )
    # Each entry is a (user id, [feature names]) pair.
    user_feature_matrix = dataset.build_user_features(
        (row['UserID'], [row['Gender'], row['Occupation'], row['age_group']])
        for index, row in users_df.iterrows()
    )
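The post does not show how `user_features_flat` was built. One thing worth checking is that the feature names passed to `dataset.fit` are the same ones passed to `build_user_features`, and that values from different columns cannot collide (for example, if `Occupation` and `age_group` both hold integer codes, a raw `1` would be treated as one shared feature). A minimal sketch of namespacing each value with its column name follows. The helper `feature_tokens` and the example rows are hypothetical and not from the original post.

```python
# Hypothetical helper, not part of LightFM: prefix every value with its
# column name so that e.g. Occupation 1 and age_group 1 stay distinct.
FEATURE_COLUMNS = ("Gender", "Occupation", "age_group")


def feature_tokens(row, columns=FEATURE_COLUMNS):
    """Return namespaced feature names such as 'Occupation:1' for one user row."""
    return [f"{col}:{row[col]}" for col in columns]


# Made-up rows standing in for records of users_df.
example_users = [
    {"UserID": 1, "Gender": "F", "Occupation": 1, "age_group": 1},
    {"UserID": 2, "Gender": "M", "Occupation": 16, "age_group": 56},
]

# Vocabulary that would be passed as user_features= to dataset.fit(...).
user_features_flat = sorted(
    {tok for row in example_users for tok in feature_tokens(row)}
)

# (user id, [feature names]) pairs, the shape build_user_features accepts.
user_feature_pairs = [
    (row["UserID"], feature_tokens(row)) for row in example_users
]
```

With this scheme the same token list feeds both calls, so a feature that is built but never fitted (or the reverse) is less likely to slip through.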

Model with user features:

    from lightfm import LightFM, evaluation

    model = LightFM(no_components=70, loss='warp')
    model.fit(interactions, user_features=user_feature_matrix, item_features=None,
              sample_weight=None, epochs=70, num_threads=4)
    p_k = evaluation.precision_at_k(model, test, k=10,
                                    user_features=user_feature_matrix,
                                    item_features=None, preserve_rows=False,
                                    num_threads=4, check_intersections=True).mean()
    p_k  # 0.14658715

Model without user features:

    model_cf = LightFM(no_components=70, loss='warp')
    model_cf.fit(interactions, user_features=None, item_features=None,
                 sample_weight=None, epochs=70, num_threads=4)
    p_k_cf = evaluation.precision_at_k(model_cf, test, k=10,
                                       user_features=None, item_features=None,
                                       preserve_rows=False, num_threads=4,
                                       check_intersections=True).mean()
    p_k_cf  # 0.1638668
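For reference when comparing the two numbers above: precision at k is the fraction of a user's top-k ranked items that appear among that user's held-out positives, averaged over users. The sketch below is a plain-Python illustration with made-up scores and item ids. It is not LightFM's implementation, and the names `precision_at_k_single`, `scores_by_user` and `test_by_user` are hypothetical.

```python
def precision_at_k_single(scores, test_items, k=10):
    """Fraction of the k highest-scored items that are held-out positives.

    scores: dict mapping item id -> predicted score for one user
    test_items: set of item ids the user interacted with in the test set
    """
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for item in ranked if item in test_items)
    return hits / k


# Made-up predictions and test positives for two users.
scores_by_user = {
    "u1": {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.5},
    "u2": {"a": 0.2, "b": 0.7, "c": 0.6, "d": 0.1},
}
test_by_user = {
    "u1": {"a", "d"},
    "u2": {"b", "c"},
}

per_user = [
    precision_at_k_single(scores_by_user[u], test_by_user[u], k=2)
    for u in scores_by_user
]
mean_precision = sum(per_user) / len(per_user)
```

Because the metric is an average over users, a gap such as 0.147 versus 0.164 reflects roughly one fewer hit in the top 10 for about one user in six.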

williamcao-01, Nov 16 '18 03:11

Same issue here: the extra features do not improve performance, regardless of how far the number of epochs is increased.

fooSynaptic, Feb 17 '21 06:02