
Code Question

Open Hamsss opened this issue 1 year ago • 6 comments

First of all, thank you for your paper and code; they have helped me a lot. I have a question about your code. This is the code of your model:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, GATConv
# G2 (the gradient-gating module) is defined elsewhere in the repository.

class G2_GNN(nn.Module):
    def __init__(self, nfeat, nhid, nclass, nlayers, conv_type='GCN', p=2., drop_in=0, drop=0, use_gg_conv=True):
        super(G2_GNN, self).__init__()
        self.conv_type = conv_type
        self.enc = nn.Linear(nfeat, nhid)
        self.dec = nn.Linear(nhid, nclass)
        self.drop_in = drop_in
        self.drop = drop
        self.nlayers = nlayers
        if conv_type == 'GCN':
            self.conv = GCNConv(nhid, nhid)
            if use_gg_conv:
                self.conv_gg = GCNConv(nhid, nhid)
        elif conv_type == 'GAT':
            # concat=True makes the 4 attention heads output 4 * nhid channels;
            # they are averaged back down to nhid in forward().
            self.conv = GATConv(nhid, nhid, heads=4, concat=True)
            if use_gg_conv:
                self.conv_gg = GATConv(nhid, nhid, heads=4, concat=True)
        else:
            raise NotImplementedError('specified graph conv not implemented')

        # The gating module either gets its own convolution (conv_gg) or
        # shares the parameters of the main convolution.
        if use_gg_conv:
            self.G2 = G2(self.conv_gg, p, conv_type, activation=nn.ReLU())
        else:
            self.G2 = G2(self.conv, p, conv_type, activation=nn.ReLU())

    def forward(self, data):
        X = data.x
        n_nodes = X.size(0)
        edge_index = data.edge_index
        X = F.dropout(X, self.drop_in, training=self.training)
        X = torch.relu(self.enc(X))

        # The same convolution (shared parameters) is applied nlayers times.
        for i in range(self.nlayers):
            if self.conv_type == 'GAT':
                # Average over the 4 heads; PyG concatenates head-wise, so
                # reshape to (n_nodes, heads, nhid) before taking the mean.
                X_ = F.elu(self.conv(X, edge_index)).view(n_nodes, 4, -1).mean(dim=1)
            else:
                X_ = torch.relu(self.conv(X, edge_index))
            tau = self.G2(X, edge_index)   # gradient-gating rates in [0, 1]
            X = (1 - tau) * X + tau * X_   # gated residual update
        X = F.dropout(X, self.drop, training=self.training)

        return self.dec(X)
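
For reference, a minimal usage sketch; the dataset, hidden width, and layer count below are illustrative placeholders I picked, not values from the repository:

import torch
from torch_geometric.datasets import Planetoid

# Hypothetical example: Cora with 64 hidden units and 16 (shared-weight) iterations.
dataset = Planetoid(root='data', name='Cora')
data = dataset[0]

model = G2_GNN(nfeat=dataset.num_features, nhid=64, nclass=dataset.num_classes,
               nlayers=16, conv_type='GCN')
logits = model(data)  # shape: (num_nodes, num_classes)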

I thought nlayers was the number of layers, but when I looked at the code, I realized that a single layer is applied nlayers times. This means that whether I set the number of layers to 16 or 32, there is always only one set of layer parameters. May I ask why you implemented the code like this? Or did I misunderstand?

And I also want to ask: if I want to check the model's performance, what should the structure of the model be? Do I just stack G2 layers?

Hamsss avatar Mar 11 '24 14:03 Hamsss

Thanks for reaching out. It is a multi-layer GNN; however, in our case we share the same parameters among the different layers. That's why we run the for-loop over the number of layers but call the same GNN layer each time. The reasons are:

  1. it gives the same, and sometimes better, performance than using different weights for each layer;
  2. it corresponds to a graph dynamical system modeled by a differential equation (please see our paper for details; a sketch of the correspondence is at the end of this reply).

I don't understand your second question, i.e., about checking the model performance.
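
In symbols (the notation here is an informal sketch rather than the paper's exact statement): each pass through the loop computes the gated update

X^{(n)} = \bigl(1 - \tau^{(n)}\bigr) \odot X^{(n-1)} + \tau^{(n)} \odot \sigma\bigl(F_W(X^{(n-1)})\bigr),

which is an explicit Euler step (with unit step size) of the graph dynamical system

\frac{dX}{dt} = \tau(X) \odot \bigl(\sigma(F_W(X)) - X\bigr),

where \tau is the gradient-gating rate produced by self.G2 and F_W is the graph convolution with the shared weights W. Increasing nlayers therefore refines the time discretization of a single dynamical system rather than adding new parameters.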

tk-rusch avatar Mar 11 '24 14:03 tk-rusch

Thank you so much for your answer; I really appreciate it. I had only been thinking of layers with different parameters.

As for my second question: I just want to run this code with different parameters among the different layers.

Hamsss avatar Mar 12 '24 02:03 Hamsss

@Hamsss Did you have any success in adapting it to non-shared parameters?

@tk-rusch Is this a requirement for the Gradient Gating framework to work? Sorry if this is a novice question.

Thank you both in advance.

bjarkedc avatar Apr 12 '24 14:04 bjarkedc

No, it's absolutely not a requirement! You can extend it to use different parameters among the different layers simply with:

self.convs = nn.ModuleList()
for i in range(nlayers):
  self.convs.append(GCNConv(nhid, nhid))

and then in forward():

for i in range(self.nlayers):
  X_ = torch.relu(self.convs[i](X, edge_index))

You can do the same for the GG layers.
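
Putting the two snippets together, here is a fuller sketch (my own assembly based on this thread, not code from the repository) of a GCN variant in which both the main convolutions and the gating convolutions get per-layer parameters:

# Sketch only: assumes torch, nn, F, GCNConv, and the repo's G2 module are
# imported as in the original model above; argument names mirror G2_GNN.
class G2_GNN_Unshared(nn.Module):
    def __init__(self, nfeat, nhid, nclass, nlayers, p=2., drop_in=0, drop=0):
        super().__init__()
        self.enc = nn.Linear(nfeat, nhid)
        self.dec = nn.Linear(nhid, nclass)
        self.drop_in = drop_in
        self.drop = drop
        self.convs = nn.ModuleList([GCNConv(nhid, nhid) for _ in range(nlayers)])
        # One gating module per layer, each wrapping its own convolution.
        self.gates = nn.ModuleList([
            G2(GCNConv(nhid, nhid), p, 'GCN', activation=nn.ReLU())
            for _ in range(nlayers)])

    def forward(self, data):
        X, edge_index = data.x, data.edge_index
        X = F.dropout(X, self.drop_in, training=self.training)
        X = torch.relu(self.enc(X))
        for conv, gate in zip(self.convs, self.gates):
            X_ = torch.relu(conv(X, edge_index))   # per-layer convolution
            tau = gate(X, edge_index)              # per-layer gating rate
            X = (1 - tau) * X + tau * X_           # gated residual update
        X = F.dropout(X, self.drop, training=self.training)
        return self.dec(X)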

tk-rusch avatar Apr 12 '24 15:04 tk-rusch

@tk-rusch Thank you for the swift reply and example.

bjarkedc avatar Apr 15 '24 06:04 bjarkedc

@tk-rusch Just to be clear, I can have multiple layers that do not share weights, and I can apply a gradient gate to each of these layers. Is that correctly understood? This does not break anything on the theoretical side, correct? Thank you in advance! Sorry for the spam.

bjarkedc avatar Jun 26 '24 14:06 bjarkedc