Auxiliary losses implementation
Hi Rex, I have a couple of questions regarding the implementation of the auxiliary losses.
- In the paper it says that 'at each layer l, we minimize L_LP = ||A^(l) - S^(l) (S^(l))^T||_F, where || · ||_F denotes the Frobenius norm.' However, in the code, what I find is:
self.link_loss = -adj * torch.log(pred_adj+eps) - (1-adj) * torch.log(1-pred_adj+eps)
which is the binary cross-entropy on pred_adj.
Could you please explain why/how this is equivalent to the mathematical formulation? Also, I believe that the pred_adj used is created with the final assignment tensor, isn't it?
- In theory you are also regularizing the entropy of the cluster assignment by minimizing L_E = (1/n) Sum_i H(S_i), but I can't see this anywhere in the code. Could you point me to it, please?
- A third comment, not related to the losses: in the experiments section of the paper you say that you use GraphSAGE as a base for the model, but as far as I can see, the code uses a GConv. Could you also enlighten me a little bit on this, please?
Thanks! Guadalupe
Hey @ggonzalezp, I'm interested in the auxiliary losses, too.
According to Issue #8, it didn't work, so it's not in the codebase. But that's really confusing, because the paper describes the losses in detail in Section 3.3. @RexYing, could you elaborate on this?
Hi, Guadalupe is right that it is a cross-entropy loss for the link prediction loss. The reason is that, since the assignment prediction contains values in [0, 1], cross-entropy is more effective than the l2 / Frobenius norm. Thanks for pointing it out. I will update the arXiv PDF with a note.
I also added the entropy regularization. It makes the assignment matrix more discrete and improves interpretability, but does not necessarily improve classification.
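For concreteness, here is a minimal sketch (not the repository's exact code) of the two formulations of the link prediction loss being compared above; the names `adj`, `assign_tensor`, and `eps` are assumptions following the conventions used elsewhere in this thread:

```python
import torch

def link_prediction_losses(adj, assign_tensor, eps=1e-7):
    """adj: (n, n) binary adjacency matrix; assign_tensor: (n, k) soft assignment S."""
    # Predicted adjacency A_hat = S S^T, with entries in [0, 1]
    pred_adj = assign_tensor @ assign_tensor.transpose(0, 1)

    # Formulation in the paper: Frobenius norm of (A - S S^T)
    frobenius_loss = torch.norm(adj - pred_adj, p="fro")

    # Formulation in the code: element-wise binary cross-entropy between A and S S^T
    bce_loss = (-adj * torch.log(pred_adj + eps)
                - (1 - adj) * torch.log(1 - pred_adj + eps)).mean()

    return frobenius_loss, bce_loss
```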
Thanks for the clarification :)
Hi Ying, I have recently been working on the DiffPool code and am curious about the auxiliary loss. Different from the paper, the repository provides both an nll_loss and a cross-entropy version of the link loss. Is there any difference between these two losses? Also, the paper describes two auxiliary losses, so in the end is only the nll_loss or the cross-entropy loss used? Looking forward to your reply. Thanks!
self.link_loss = F.nll_loss(torch.log(pred_adj), adj)
self.link_loss = -adj * torch.log(pred_adj+eps) - (1-adj) * torch.log(1-pred_adj+eps)
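For what it's worth, the second, element-wise expression is just binary cross-entropy written out by hand (whereas F.nll_loss expects integer class targets, so it treats each row as a classification problem rather than working element-wise). A small sketch of the equivalence; the tensors below are illustrative examples, not from the repository:

```python
import torch
import torch.nn.functional as F

# Hypothetical example tensors: a 0/1 adjacency matrix and a predicted one in (0, 1)
adj = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
pred_adj = torch.tensor([[0.9, 0.2], [0.1, 0.8]])
eps = 1e-7

# Hand-written element-wise binary cross-entropy, as in the repository snippet above
manual = (-adj * torch.log(pred_adj + eps)
          - (1 - adj) * torch.log(1 - pred_adj + eps)).mean()

# The same quantity via the built-in helper (up to the eps term)
builtin = F.binary_cross_entropy(pred_adj, adj)

print(manual.item(), builtin.item())  # nearly identical values
```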
@RexYing
Excuse me, could you please indicate where the entropy regularization is in your code? I can't find it.
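In case it helps other readers, here is a minimal sketch of the entropy regularization L_E = (1/n) Sum_i H(S_i) described in the paper; `assign_tensor` is an assumed name for the soft assignment matrix S, not necessarily the one used in the repository:

```python
import torch

def entropy_regularization(assign_tensor, eps=1e-7):
    """assign_tensor: (n, k) soft assignment S whose rows are softmax outputs."""
    # Row-wise entropy H(S_i) = -sum_j S_ij * log(S_ij)
    row_entropy = (-assign_tensor * torch.log(assign_tensor + eps)).sum(dim=-1)
    # Average over the n nodes: low entropy means near one-hot (more discrete) assignments
    return row_entropy.mean()
```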