
Tutorial 1 - Cora dataset: unable to understand the masks and how their counts add up

Open mriganktiwari opened this issue 2 years ago • 0 comments

Hi Antonio or anyone else,

I am trying to understand what the counts 140, 500, and 1000 mean for the train, val, and test masks respectively. I ran:

torch.sum(data.train_mask), torch.sum(data.val_mask), torch.sum(data.test_mask), data

This gives me the following result:

(tensor(140, device='cuda:0'),
 tensor(500, device='cuda:0'),
 tensor(1000, device='cuda:0'),
 Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708]))

Question 1: Does this imply that only 140 nodes are available for training, while val and test get many more under this dataset's split? I come from a non-GNN background, so this caught my eye.

Question 2: 140 + 500 + 1000 = 1640, which does not add up to 2708. Should it not?
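To make the arithmetic in Question 2 concrete, here is a minimal sketch that uses plain Python lists as stand-ins for `data.train_mask`, `data.val_mask`, and `data.test_mask`. The helper `make_mask` and the positions of the `True` entries are hypothetical; only the counts (140, 500, 1000) and the node total (2708) come from the output above. It checks whether any node sits in more than one mask and counts how many nodes sit in none.

```python
# Stand-ins for the Cora masks: boolean lists instead of tensors.
# Only the counts and NUM_NODES come from the printed output; the
# positions of the True entries are an assumption for illustration.
NUM_NODES = 2708


def make_mask(start, count, size=NUM_NODES):
    """Return a boolean list with `count` True entries starting at `start`."""
    return [start <= i < start + count for i in range(size)]


train_mask = make_mask(0, 140)
val_mask = make_mask(140, 500)
test_mask = make_mask(NUM_NODES - 1000, 1000)

n_train, n_val, n_test = sum(train_mask), sum(val_mask), sum(test_mask)

# Nodes that appear in more than one mask (0 means the splits are disjoint).
n_overlap = sum(
    1 for t, v, s in zip(train_mask, val_mask, test_mask) if t + v + s > 1
)

# Nodes that appear in no mask at all.
n_unassigned = sum(
    1 for t, v, s in zip(train_mask, val_mask, test_mask) if not (t or v or s)
)

print("in masks:  ", n_train + n_val + n_test)
print("overlap:   ", n_overlap)
print("unassigned:", n_unassigned)
```

If the real masks turn out to be disjoint as in this sketch, the gap 2708 - 1640 corresponds to nodes that belong to none of the three masks. On the actual boolean tensors, the same checks can be written with `&`, `|`, and `~` followed by `.sum()`.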

mriganktiwari, Apr 29 '23 10:04