Cinderella1001
> it seems the attention matrix A is kept the same in all propagation layers (within each epoch); I am wondering, shouldn't we calculate the attention matrix for each layer separately based...
The attention score is indeed fixed during training within one epoch. But the attention score in the next epoch (also fixed) is different from the attention score in this epoch. Maybe updating the attention score in different...
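A minimal sketch of that behaviour (hypothetical module and layer sizes, not the repo's actual code): the attention is computed once from the current embeddings at the start of the epoch and reused by every propagation layer; since the embeddings change between epochs, the next epoch's attention differs.

```python
import torch
import torch.nn as nn

class ToyGNN(nn.Module):
    """Sketch only: edge attention is computed once per epoch from the current
    embeddings and then reused, unchanged, by every propagation layer."""
    def __init__(self, dim, num_layers=3):
        super().__init__()
        self.att_mlp = nn.Linear(2 * dim, 1)   # hypothetical attention scorer
        self.num_layers = num_layers

    def compute_attention(self, emb, edges):
        src, dst = edges
        logits = self.att_mlp(torch.cat([emb[src], emb[dst]], dim=-1)).squeeze(-1)
        return torch.softmax(logits, dim=0)    # simplistic normalisation over all edges

    def propagate(self, emb, edges, att):
        # the same `att` is applied at every layer; it is NOT recomputed per layer
        src, dst = edges
        h = emb
        for _ in range(self.num_layers):
            msg = torch.zeros_like(h).index_add_(0, dst, att.unsqueeze(-1) * h[src])
            h = h + msg
        return h

# schematic per-epoch loop: attention is refreshed once per epoch, then frozen
# for epoch in range(num_epochs):
#     att = model.compute_attention(embeddings, edges)   # fixed for this epoch
#     out = model.propagate(embeddings, edges, att)
```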
Could I ask what your average BPR loss and TransR loss are in epoch 0, i.e. the sum of the BPR loss over all batches divided by the number of batches? The magnitude of my loss is...
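For reference, this is the averaging I mean (illustrative per-batch values only):

```python
# epoch-level loss = sum of per-batch loss / number of batches
def epoch_average(batch_losses):
    return sum(batch_losses) / len(batch_losses)

bpr_losses = [0.6931, 0.6929, 0.6925]   # made-up numbers, just to show the computation
print(f"avg BPR loss in epoch 0: {epoch_average(bpr_losses):.4f}")
```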
What I don't quite understand is the difference in how gamma^(k+1)_{a1}, gamma^(k+1)_{a2} (the attention on the global graph) and alpha^(k+1)_{ab}, beta^(k+1)_{ai} (the attention on the local graphs) are computed: why are gamma^(k+1)_{a1} and gamma^(k+1)_{a2} computed by concatenating two embeddings and feeding them into an MLP, while alpha^(k+1)_{ab} and beta^(k+1)_{ai} are computed by feeding only the user embedding or only the item embedding into an MLP?
Is the difference between the two the difference between node attention and graph attention? Since a user sits on both the social network and the interest network, a graph attention is needed, whereas an item sits only on the interest network and therefore only needs node attention. My question is: can the user's node attention be removed, keeping only the graph attention, i.e. dropping the computation of the intermediate variables p and q? Did you adopt node attention because it gives better experimental results? A toy sketch of the structural difference I mean follows below.
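To make sure I understand the difference I'm asking about, here is a toy sketch (hypothetical layer sizes and names, not your implementation): the graph-level gamma scores come from an MLP over concatenated embeddings, while the node-level alpha/beta scores come from an MLP over a single neighbour embedding.

```python
import torch
import torch.nn as nn

class ToyDualAttention(nn.Module):
    """Sketch only: contrasts the two attention forms discussed above.
      - graph-level gamma: MLP over the concatenation of two embeddings
      - node-level alpha/beta: MLP over one neighbour embedding alone"""
    def __init__(self, dim):
        super().__init__()
        self.graph_mlp = nn.Linear(2 * dim, 1)  # gamma^(k+1): concat -> score
        self.node_mlp = nn.Linear(dim, 1)       # alpha^(k+1)/beta^(k+1): single emb -> score

    def graph_attention(self, user_emb, social_agg, interest_agg):
        # gamma_{a1}, gamma_{a2}: how much the user weighs each graph's aggregation
        s1 = self.graph_mlp(torch.cat([user_emb, social_agg], dim=-1))
        s2 = self.graph_mlp(torch.cat([user_emb, interest_agg], dim=-1))
        return torch.softmax(torch.cat([s1, s2], dim=-1), dim=-1)      # (B, 2)

    def node_attention(self, neighbor_emb):
        # alpha_{ab} / beta_{ai}: one score per neighbour, from that embedding alone
        return torch.softmax(self.node_mlp(neighbor_emb).squeeze(-1), dim=-1)  # (B, N)
```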
I also ran into this problem; is there any solution?
Same question here. It is indeed as recorded in the author's logs: each epoch takes about 900 s, so running all epochs would take about 5 days! Is there any way to speed it up?