Some questions about the backbone model
I noticed that the backbone model used for the regularization-based GNNs is GCN, while GRAND seems to use a mixed-order propagation backbone. Is this a fair comparison? I wonder whether GRAND benefits substantially from the larger receptive field.
GRAND adopts several techniques to improve performance on this task. Mixed-order propagation is one component of GRAND, and it can reduce over-smoothing for two reasons: 1) it focuses more on local information, and 2) it removes the non-linear transformations between layers. Employing this propagation to perform random data augmentation is also a contribution of this work. The results of the other regularization methods are taken directly from their original papers for convenience. I think combining other regularization methods (e.g., mixup) with this propagation rule is a promising research direction; feel free to give it a try if you are interested :).
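For concreteness, the mixed-order propagation averages the 0- to K-hop propagated features, X̄ = (1/(K+1)) Σ_{k=0}^{K} Â^k X, where Â is the symmetrically normalized adjacency with self-loops, and applies no non-linear transformation between hops. Below is a minimal PyTorch sketch of this rule (the function name and the dense/sparse handling are illustrative, not taken from the released code):

```python
import torch

def mixed_order_propagation(adj_hat: torch.Tensor, x: torch.Tensor, K: int) -> torch.Tensor:
    """Average the 0..K-hop propagated features: X_bar = 1/(K+1) * sum_k A_hat^k X.

    adj_hat: symmetrically normalized adjacency with self-loops (N x N, dense or sparse).
    x: node feature matrix (N x F), e.g. after a random DropNode-style masking.
    Note there is no non-linear transformation between hops.
    """
    out = x
    acc = x.clone()  # the k = 0 term
    for _ in range(K):
        # one more propagation step: A_hat^(k+1) X
        out = torch.sparse.mm(adj_hat, out) if adj_hat.is_sparse else adj_hat @ out
        acc = acc + out
    return acc / (K + 1)
```

Averaging over hop orders is what keeps the receptive field large while the 0- and 1-hop terms still dominate locally, which is the intuition behind point 1) above.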