Formula used for training loss
Hi Ben,
First, many thanks for the library. It is an amazing piece of work! :-)
I have a quick question on the calculation of the training loss (for a case where every element in Ciu>=0, i.e. no negative feedback):
I believe the numerator is given by (Ciu * (Piu - scores)^2).sum().sum() + lambda * (frobenius-norm(item_factors) + frobenius-norm(user_factors))
where
- Ciu is the confidence matrix with zeros (so items with zero confidence vanish in the sum)
- Piu is a 0/1 matrix indicating which items were watched
- scores = item_factors * user_factors.transpose
Then you divide by: (Ciu+1).sum().sum()
Is this correct? I cannot recover the training loss "manually" for a simple case. For instance, are you actually computing the Frobenius norm? I cannot see where you take the root.
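Just to spell out what I mean by "taking the root" (plain NumPy, nothing from the library; X is just a toy factor matrix):

import numpy as np

X = np.arange(6, dtype=np.float64).reshape(3, 2)  # toy factor matrix

frob = np.linalg.norm(X)                          # Frobenius norm: sqrt of the sum of squares
frob_squared = (X ** 2).sum()                     # squared Frobenius norm: no root taken
row_norm_sum = np.linalg.norm(X, axis=1).sum()    # sum of per-row norms: yet another quantity

assert np.isclose(frob ** 2, frob_squared)

I am not sure which of these quantities the reported loss actually uses.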
Hi Ben, to make things a bit more concrete, I use:
import numpy as np


def compute_mse(ciu, item_factors, user_factors, regularization):
    '''
    ciu: confidence matrix with zeros (this is Cui - 1 in the paper)
    item_factors: item factors
    user_factors: user factors
    regularization: lambda parameter

    Returns (loss, objective), where loss is the MSE and objective is
    loss + regularization * (...).
    '''
    ciu_dense = ciu.toarray()
    # preference matrix Piu: 1 where there was an interaction, 0 otherwise
    p = (ciu_dense > 0).astype(np.float64)
    # predicted scores
    scores = item_factors @ user_factors.T
    # sum of the per-row L2 norms of the factor matrices
    user_factor_norm = np.linalg.norm(user_factors, axis=1).sum()
    item_factor_norm = np.linalg.norm(item_factors, axis=1).sum()
    # ciu summed over all entries, plus one for every zero entry
    normalizer = ciu_dense.sum() + ciu_dense.shape[0] * ciu_dense.shape[1] - p.sum()
    loss = (1 / normalizer) * np.multiply(ciu_dense + 1, (p - scores) ** 2).sum()
    reg = (1 / normalizer) * regularization * (item_factor_norm + user_factor_norm)
    objective = loss + reg
    return loss, objective
and I try to use it on:
import numpy as np
from scipy import sparse

spcsr = sparse.csr_matrix([[1, 1, 0, 1, 0, 0],
                           [0, 1, 1, 1, 0, 0],
                           [1, 4, 1, 0, 7, 0],
                           [1, 1, 0, 0, 0, 0],
                           [9, 0, 4, 1, 0, 1],
                           [0, 1, 0, 0, 0, 1],
                           [0, 0, 2, 0, 1, 1]],
                          dtype=np.float64)
n_users = spcsr.shape[1]
n_items = spcsr.shape[0]
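For completeness, this is roughly how I then call compute_mse. The factors/regularization/iterations values are arbitrary placeholders, and I am assuming a version of implicit whose fit() expects an item-user matrix (items as rows), which is how spcsr above is laid out; newer releases expect user-item instead:

import implicit

model = implicit.als.AlternatingLeastSquares(factors=16,
                                             regularization=0.01,
                                             iterations=15)
model.fit(spcsr)  # assumes item-user orientation, see note above

loss, objective = compute_mse(spcsr,
                              model.item_factors,
                              model.user_factors,
                              regularization=0.01)
print(loss, objective)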
many thanks
Isn't Ciu equal to one for non-occurrences? Because we multiply by alpha and then add 1; at least that is how it is done in the original publication. I am not sure how this library implements it, though.
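To illustrate what I mean, here is a rough sketch of the construction from the Hu/Koren/Volinsky paper (not of this library's internals; alpha and the toy matrix are placeholders). My understanding is that you pass alpha * Rui to the library and the +1 is added internally, but I may be wrong about that:

import numpy as np
from scipy import sparse

alpha = 40.0  # placeholder; the paper treats this as a tuning parameter
raw = sparse.csr_matrix([[1, 0, 3],
                         [0, 2, 0]], dtype=np.float64)  # Rui: raw interaction counts

# confidence as defined in the paper: Cui = 1 + alpha * Rui,
# so pairs with Rui = 0 still get confidence exactly 1
conf_dense = 1.0 + alpha * raw.toarray()

# what I believe is passed to the library instead: alpha * Rui, i.e. Cui - 1,
# which stays sparse because the zero entries stay zero
conf_minus_one = (alpha * raw).tocsr()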