Auxiliary losses implementation
Hi Rex, I have a couple of questions regarding the implementation of the auxiliary losses.
- In the paper it says that 'at each layer l, we minimize L_LP = ||A^(l) - S^(l) (S^(l))^T||_F, where || · ||_F denotes the Frobenius norm.' However, in the code, what I find is:
self.link_loss = -adj * torch.log(pred_adj+eps) - (1-adj) * torch.log(1-pred_adj+eps)
which is the binary cross-entropy on pred_adj.
Could you please explain why/how this is equivalent to the mathematical formulation? Also, I believe that the pred_adj used is created with the final assignment tensor, isn't it?
- In theory you are also regularizing the entropy of the cluster assignment by minimizing L_E = (1/n) Sum_i H(S_i), but I can't see this anywhere in the code. Could you point me to it, please?
- A third comment, not related to the losses: in the experiments section of the paper you say that you use GraphSAGE as a base for the model, but as far as I can see, the code uses a GConv. Could you also enlighten me a little bit on this, please?
Thanks! Guadalupe
Hey @ggonzalezp, I'm interested in the auxiliary losses, too.
According to Issue #8, it didn't work, so it's not in the codebase. But that's really confusing, because the paper describes the losses in detail in Section 3.3. @RexYing, could you elaborate on this?
Hi, Guadalupe is right that it is a cross-entropy loss for the link prediction loss. The reason is that, since the assignment prediction contains values in [0, 1], cross-entropy is more effective than the l2 / Frobenius norm. Thanks for pointing it out. I will update the arXiv PDF with a note.
I also added the entropy regularization. It makes the assignment matrix more discrete and improves interpretability, but does not necessarily improve classification.
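For concreteness, here is a minimal sketch (not the repository's exact code) of the two formulations of the link prediction loss being compared above; the names `adj`, `assign_tensor`, and `eps` are assumptions following the conventions used elsewhere in this thread:

```python
import torch

def link_prediction_losses(adj, assign_tensor, eps=1e-7):
    """adj: (n, n) binary adjacency matrix; assign_tensor: (n, k) soft assignment S."""
    # Predicted adjacency A_hat = S S^T, with entries in [0, 1]
    pred_adj = assign_tensor @ assign_tensor.transpose(0, 1)

    # Formulation in the paper: Frobenius norm of (A - S S^T)
    frobenius_loss = torch.norm(adj - pred_adj, p="fro")

    # Formulation in the code: element-wise binary cross-entropy between A and S S^T
    bce_loss = (-adj * torch.log(pred_adj + eps)
                - (1 - adj) * torch.log(1 - pred_adj + eps)).mean()

    return frobenius_loss, bce_loss
```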
Thanks for the clarification :)
Hi Ying, I have recently been working on the DiffPool code and am curious about the auxiliary loss. Different from the paper, the repository provides both an nll_loss and a cross-entropy version of the link loss. Is there any difference between these two losses? Also, the paper describes two auxiliary losses, so in the end is only the nll_loss or the cross-entropy loss used? Looking forward to your reply. Thanks!
self.link_loss = F.nll_loss(torch.log(pred_adj), adj)
self.link_loss = -adj * torch.log(pred_adj+eps) - (1-adj) * torch.log(1-pred_adj+eps)
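For what it's worth, the second, element-wise expression is just binary cross-entropy written out by hand (whereas F.nll_loss expects integer class targets, so it treats each row as a classification problem rather than working element-wise). A small sketch of the equivalence; the tensors below are illustrative examples, not from the repository:

```python
import torch
import torch.nn.functional as F

# Hypothetical example tensors: a 0/1 adjacency matrix and a predicted one in (0, 1)
adj = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
pred_adj = torch.tensor([[0.9, 0.2], [0.1, 0.8]])
eps = 1e-7

# Hand-written element-wise binary cross-entropy, as in the repository snippet above
manual = (-adj * torch.log(pred_adj + eps)
          - (1 - adj) * torch.log(1 - pred_adj + eps)).mean()

# The same quantity via the built-in helper (up to the eps term)
builtin = F.binary_cross_entropy(pred_adj, adj)

print(manual.item(), builtin.item())  # nearly identical values
```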
@RexYing
Excuse me, could you please indicate where the entropy regularization is in your code? I can't find it.
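In case it helps other readers, here is a minimal sketch of the entropy regularization L_E = (1/n) Sum_i H(S_i) described in the paper; `assign_tensor` is an assumed name for the soft assignment matrix S, not necessarily the one used in the repository:

```python
import torch

def entropy_regularization(assign_tensor, eps=1e-7):
    """assign_tensor: (n, k) soft assignment S whose rows are softmax outputs."""
    # Row-wise entropy H(S_i) = -sum_j S_ij * log(S_ij)
    row_entropy = (-assign_tensor * torch.log(assign_tensor + eps)).sum(dim=-1)
    # Average over the n nodes: low entropy means near one-hot (more discrete) assignments
    return row_entropy.mean()
```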