Code Question
First of all, thank you for your paper and code; they have helped me a lot. I have a question about your code. This is the code of your model:
# Imports needed to run this snippet; the G2 gating module is defined
# elsewhere in the same repository.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, GATConv

class G2_GNN(nn.Module):
    def __init__(self, nfeat, nhid, nclass, nlayers, conv_type='GCN', p=2.,
                 drop_in=0, drop=0, use_gg_conv=True):
        super(G2_GNN, self).__init__()
        self.conv_type = conv_type
        self.enc = nn.Linear(nfeat, nhid)
        self.dec = nn.Linear(nhid, nclass)
        self.drop_in = drop_in
        self.drop = drop
        self.nlayers = nlayers
        if conv_type == 'GCN':
            self.conv = GCNConv(nhid, nhid)
            if use_gg_conv == True:
                self.conv_gg = GCNConv(nhid, nhid)
        elif conv_type == 'GAT':
            self.conv = GATConv(nhid, nhid, heads=4, concat=True)
            if use_gg_conv == True:
                self.conv_gg = GATConv(nhid, nhid, heads=4, concat=True)
        else:
            print('specified graph conv not implemented')
        if use_gg_conv == True:
            self.G2 = G2(self.conv_gg, p, conv_type, activation=nn.ReLU())
        else:
            self.G2 = G2(self.conv, p, conv_type, activation=nn.ReLU())

    def forward(self, data):
        X = data.x
        n_nodes = X.size(0)
        edge_index = data.edge_index
        X = F.dropout(X, self.drop_in, training=self.training)
        X = torch.relu(self.enc(X))
        # The same self.conv and the same gating module self.G2 are applied in
        # every iteration, i.e. the parameters are shared across all nlayers layers.
        for i in range(self.nlayers):
            if self.conv_type == 'GAT':
                # average the 4 attention heads back down to nhid channels
                X_ = F.elu(self.conv(X, edge_index)).view(n_nodes, -1, 4).mean(dim=-1)
            else:
                X_ = torch.relu(self.conv(X, edge_index))
            tau = self.G2(X, edge_index)
            X = (1 - tau) * X + tau * X_
        X = F.dropout(X, self.drop, training=self.training)
        return self.dec(X)
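For reference, this is roughly how I am running it, in case that matters. It is only a minimal sketch; the dataset, hidden size, and training settings are my own choices, not the ones from your paper.

# Hypothetical usage sketch: train G2_GNN on the Cora citation graph.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='data/Planetoid', name='Cora')
data = dataset[0]

model = G2_GNN(nfeat=dataset.num_features, nhid=64, nclass=dataset.num_classes,
               nlayers=16, conv_type='GCN', drop_in=0.5, drop=0.5)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(100):
    optimizer.zero_grad()
    out = model(data)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    pred = model(data).argmax(dim=-1)
test_acc = (pred[data.test_mask] == data.y[data.test_mask]).float().mean().item()
print(f'test accuracy: {test_acc:.3f}')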
I thought nlayers was the number of layers, but when I looked at the code I realized that a single layer is reused nlayers times. I think this means that whether I set the number of layers to 16 or 32, there is really only one layer's worth of parameters. May I ask why you implemented the code like this, or did I misunderstand?
And I also want to ask: if I want to check the model performance, what is the order of the model? Do I just pile the G2 layers up?
Thanks for reaching out. It is a multi-layer GNN; however, in our case we share the same parameters among the different layers. That's why we do the for-loop over the number of layers but call the same GNN each time. The reasons for this are:
- it gives the same and sometimes better performance than using different weights for each layer
- it corresponds to a graph-dynamical system modeled by a differential equation (please see our paper for details)
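To make that second point a bit more explicit (this is just rearranging the update in the code, roughly speaking): the line X = (1 - tau) * X + tau * X_ is the same as X = X + tau * (X_ - X), i.e. an explicit-Euler-type step with node-wise rates tau of the system dX/dt = tau * (relu(conv(X)) - X) (elementwise). Running the shared layer for nlayers iterations then amounts to integrating that dynamical system in time.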
I don't understand your second question, i.e., about checking the model performance.
Thank you so much for your answer. I really appreciate it. I was only thinking of layers with different parameters.
And what my second question meant is: I just want to run this code with different parameters among the different layers.
@Hamsss Did you have any success in adapting it to non-shared parameters?
@tk-rusch Is this a requirement for the Gradient Gating framework to work? Sorry if this is a novice question.
Thank you both in advance.
No, it's absolutely not! You can extend it to use different parameters among the different layers by simply writing:
self.convs = nn.ModuleList()
for i in range(nlayers):
    self.convs.append(GCNConv(nhid, nhid))
and then in forward():
for i in range(self.nlayers):
    X_ = torch.relu(self.convs[i](X, edge_index))
You can do the same for the GG layers.
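For completeness, here is one way to put those pieces together into a full non-shared variant. This is only a sketch under the assumption that you also want a separate gating convolution per layer; the class name and default hyperparameters are illustrative, and it reuses the G2 gating module from the repository.

# Hypothetical sketch of a G2 model with per-layer (non-shared) parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class G2_GNN_NonShared(nn.Module):
    def __init__(self, nfeat, nhid, nclass, nlayers, p=2., drop_in=0, drop=0):
        super().__init__()
        self.enc = nn.Linear(nfeat, nhid)
        self.dec = nn.Linear(nhid, nclass)
        self.drop_in = drop_in
        self.drop = drop
        self.nlayers = nlayers
        # one message-passing conv and one G2 gate (with its own gating conv) per layer
        self.convs = nn.ModuleList()
        self.gates = nn.ModuleList()
        for i in range(nlayers):
            self.convs.append(GCNConv(nhid, nhid))
            self.gates.append(G2(GCNConv(nhid, nhid), p, 'GCN', activation=nn.ReLU()))

    def forward(self, data):
        X, edge_index = data.x, data.edge_index
        X = F.dropout(X, self.drop_in, training=self.training)
        X = torch.relu(self.enc(X))
        for i in range(self.nlayers):
            X_ = torch.relu(self.convs[i](X, edge_index))
            tau = self.gates[i](X, edge_index)   # per-layer gradient gate
            X = (1 - tau) * X + tau * X_
        X = F.dropout(X, self.drop, training=self.training)
        return self.dec(X)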
@tk-rusch Thank you for the swift reply and example.
@tk-rusch Just to be clear, I can have multiple layers that do not share weights, and I can apply a gradient gate to each of these layers. Is that correctly understood? This does not break anything on the theoretical side, correct? Thank you in advance! Sorry for the spam.