
MetaApprox gives much higher accuracy for the perturbed graph

Open cxw-droid opened this issue 4 years ago • 4 comments

Hi, thanks for sharing the DeepRobust code.

I am testing the mettack.py code using test_mettack.py. When the model is A-Meta-Self, the accuracy on the attacked graph is much higher than with the Meta-Self model. For the Cora dataset with ptb_rate 0.2, the attacked-graph accuracy with the Meta-Self model is 0.4834, while the A-Meta-Self model gives a much higher accuracy of 0.7596.

The A-Meta-Self model shows similar problems at other ptb_rate values. Does it need a special parameter setting? What are your test results for A-Meta-Self?
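
For reference, my setup follows the DeepRobust examples roughly like below (a sketch; the exact hyperparameters and perturbation budget in test_mettack.py may differ):

```python
import numpy as np
from deeprobust.graph.data import Dataset
from deeprobust.graph.defense import GCN
from deeprobust.graph.global_attack import Metattack, MetaApprox

data = Dataset(root='/tmp/', name='cora')
adj, features, labels = data.adj, data.features, data.labels
idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
idx_unlabeled = np.union1d(idx_val, idx_test)

# Linearized surrogate GCN, as in the Metattack paper
surrogate = GCN(nfeat=features.shape[1], nclass=labels.max().item() + 1,
                nhid=16, with_relu=False, device='cpu').to('cpu')
surrogate.fit(features, adj, labels, idx_train)

n_perturbations = int(0.2 * (adj.sum() // 2))  # ptb_rate 0.2

# Meta-Self (exact meta-gradient): lambda_=0
attacker = Metattack(model=surrogate, nnodes=adj.shape[0], feature_shape=features.shape,
                     device='cpu', lambda_=0).to('cpu')
attacker.attack(features, adj, labels, idx_train, idx_unlabeled,
                n_perturbations, ll_constraint=False)
modified_adj_meta = attacker.modified_adj

# A-Meta-Self (approximated meta-gradient): same arguments, MetaApprox class
attacker = MetaApprox(model=surrogate, nnodes=adj.shape[0], feature_shape=features.shape,
                      device='cpu', lambda_=0).to('cpu')
attacker.attack(features, adj, labels, idx_train, idx_unlabeled,
                n_perturbations, ll_constraint=False)
modified_adj_approx = attacker.modified_adj
```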

cxw-droid avatar Jul 27 '21 19:07 cxw-droid

Hi, thanks for your interest in our repository!

I am not sure about the hyper-parameters of A-Meta-Self. In the paper, they only show the performance of A-Meta-Train and A-Meta-Both. I just tried A-Meta-Train on Cora with 0.2 ptb_rate and obtained a lower accuracy of 0.6725. In my view, the approximation of the meta gradient is actually pretty aggressive, because we discard the whole training trajectory of the inner problem.
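
To illustrate what I mean by discarding the trajectory, here is a toy sketch (my own simplification, not the DeepRobust code, and it uses a single loss for both the inner training and the meta-gradient):

```python
import torch

torch.manual_seed(0)
A = torch.randn(5, 5)                            # stand-in for the clean adjacency
delta = torch.zeros(5, 5, requires_grad=True)    # stand-in for adj_changes
x = torch.randn(5, 3)
y = torch.randn(5, 1)

def loss_fn(w, adj):
    return ((adj @ x @ w - y) ** 2).mean()

# Exact meta-gradient (Metattack-style): keep every inner update in the graph
w = torch.zeros(3, 1, requires_grad=True)
for _ in range(5):
    g = torch.autograd.grad(loss_fn(w, A + delta), w, create_graph=True)[0]
    w = w - 0.1 * g                              # w now depends on delta via the whole trajectory
exact = torch.autograd.grad(loss_fn(w, A + delta), delta)[0]

# Approximated meta-gradient (MetaApprox-style): train normally and only
# accumulate d(loss)/d(delta) after each step, so the trajectory is discarded
w = torch.zeros(3, 1, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)
approx = torch.zeros_like(delta)
for _ in range(5):
    loss = loss_fn(w, A + delta)
    approx += torch.autograd.grad(loss, delta, retain_graph=True)[0]
    opt.zero_grad()
    loss.backward()
    opt.step()

print((exact - approx).norm())                   # the two estimates generally differ
```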

ChandlerBang avatar Jul 27 '21 21:07 ChandlerBang

Thanks for your reply. I got a similar result for A-Meta-Train. Now I am just a little confused:

  1. Why is the attack result of Meta-Self much better than Meta-Train, while the result of A-Meta-Self is much worse than A-Meta-Train?

  2. In line 492 of MetaApprox()::inner_train(), self.adj_grad_sum += torch.autograd.grad(attack_loss, self.adj_changes, retain_graph=True)[0], why do you take the derivative of attack_loss w.r.t. self.adj_changes instead of modified_adj? It seems the mettack paper takes the derivative w.r.t. the current adjacency matrix, which corresponds to modified_adj in your code. The results differ when modified_adj is used.

cxw-droid avatar Jul 28 '21 23:07 cxw-droid

Hi,

  1. I am not very sure about the reason behind this phenomenon. It could be that Meta-Self involves more label information, and when we approximate the training trajectory, A-Meta-Self simplifies things too much to estimate the gradient direction well.
  2. We followed the authors' tensorflow implementation. See here. It should be fine to directly calculate the gradient w.r.t. A, as it is the same as the gradient w.r.t. ΔA. But note that we apply a symmetrization operation to ΔA before computing ΔA + A. See https://github.com/DSE-MSU/DeepRobust/blob/2a52969fb8b881ac5325a8d0a26a6880aa8b6a9b/deeprobust/graph/global_attack/mettack.py#L70-L74
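
For reference, the linked get_modified_adj does roughly the following (a paraphrase written as a free function; see the repo for the exact lines):

```python
import torch

def get_modified_adj(adj_changes, ori_adj):
    # zero out the diagonal of the raw change matrix
    adj_changes_square = adj_changes - torch.diag(torch.diag(adj_changes, 0))
    # symmetrize and clamp to [-1, 1] before adding it to the clean adjacency
    adj_changes_symm = torch.clamp(adj_changes_square + adj_changes_square.t(), -1, 1)
    return adj_changes_symm + ori_adj
```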

ChandlerBang avatar Jul 28 '21 23:07 ChandlerBang

Yes, theoretically the gradient w.r.t. A or ΔA should be the same.

As for the symmetrization code, I compared adj_changes_symm and self.adj_changes right before here and found no difference. I compared them with the check if (adj_changes_symm != self.adj_changes).sum().item() > 0:.

I also compared the test results before and after changing here to modified_adj = self.adj_changes + ori_adj, for both model Self and model A-Self at ptb_rate 0.2. For model Self the results are slightly different; for A-Self they are exactly the same. If the gradients w.r.t. A and ΔA are the same, shouldn't the test results be exactly the same?
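
A minimal sanity check on the plain-sum case (a toy of mine, not the repo code), where the two gradients do coincide exactly:

```python
import torch

ori_adj = torch.rand(4, 4)
adj_changes = torch.zeros(4, 4, requires_grad=True)
modified_adj = adj_changes + ori_adj                # plain sum, no symmetrization
loss = (modified_adj @ modified_adj).sum()          # any differentiable loss works
g_delta = torch.autograd.grad(loss, adj_changes, retain_graph=True)[0]
g_adj = torch.autograd.grad(loss, modified_adj)[0]  # gradient w.r.t. the intermediate tensor
print(torch.equal(g_delta, g_adj))                  # True: the add node passes the gradient through
```

With the symmetrization and clamp in between, the chain rule also routes the gradient w.r.t. self.adj_changes through the transpose and clamp step.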

cxw-droid avatar Jul 29 '21 23:07 cxw-droid