jzhou316

25 comments by jzhou316

Hi, thanks for the question! Currently it is normalized by the source node degree when `self.deg_norm == 'rw'`. Can you explain a bit about "but actually we have to do...

Hi, I think that, based on your description, we first normalize each node's features by its outgoing degree, and then aggregate the features of different source nodes into...
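To make the order of operations concrete, here is a minimal sketch of that kind of source-degree ("random walk" style) normalization followed by aggregation. It is not the repository's implementation: the toy tensors, the variable names, and the convention that `edge_index[0]` holds source nodes and `edge_index[1]` holds target nodes are all assumptions for illustration.

```python
import torch

# Toy graph (assumed for illustration): 3 nodes, 2 features each, 4 directed edges.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
edge_index = torch.tensor([[0, 0, 1, 2],   # source nodes (assumed convention)
                           [1, 2, 2, 0]])  # target nodes (assumed convention)
src, dst = edge_index
num_nodes = x.size(0)

# Outgoing degree of every node: how many edges start at it.
out_deg = torch.bincount(src, minlength=num_nodes).to(x.dtype)

# Step 1: build one message per edge, scaled by 1 / out-degree of its source node.
msg = x[src] / out_deg[src].unsqueeze(1)

# Step 2: sum the messages that arrive at each target node.
out = torch.zeros_like(x).index_add_(0, dst, msg)
print(out)
```

Node 0 has two outgoing edges here, so each of its messages carries half of its feature vector, while nodes 1 and 2 pass their features on unscaled.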

The line `x_j = torch.index_select(x_j, 0, edge_index[0])` just copies the node features onto each edge, which expands the source feature matrix (N x C) into an edge feature matrix...
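A small standalone example may help show what that call does to the shapes. The toy values below are assumptions for illustration, and the input is named `x` here rather than reusing `x_j`, so the before and after are easier to tell apart.

```python
import torch

# Toy node feature matrix (N x C) with N = 3 nodes and C = 2 features.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# E = 4 directed edges; row 0 is assumed to hold the source node of each edge.
edge_index = torch.tensor([[0, 0, 1, 2],
                           [1, 2, 2, 0]])

# Pick one row of x per edge, indexed by that edge's source node: (N x C) -> (E x C).
x_j = torch.index_select(x, 0, edge_index[0])

print(x.shape, x_j.shape)
print(x_j)
```

Node 0 is the source of two edges, so its feature row appears twice in `x_j`. The same gather can also be written as `x[edge_index[0]]`.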

Hi, we describe how the large graph datasets are stored in our unified HDF5 graph data format [here](https://github.com/harvardnlp/botnet-detection/blob/master/graph_data_storage.md). The API format of pyg, dgl, nx, or dict...

The `xx_split_idx.pkl` stores the indices used to split the original large graph dataset (in HDF5 format) into train/validation/test sets. It is a dictionary with keys `"train"`, `"val"`, and `"test"`,...
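As a rough sketch of how such a file could be written and read back with the standard `pickle` module: only the three keys come from the description above. The file name `example_split_idx.pkl` and the list-of-graph-indices values are hypothetical, since the actual value layout is not shown here.

```python
import os
import pickle
import tempfile

# Hypothetical split: which graph indices go to which subset (made-up values).
split_idx = {"train": [0, 1, 2, 3], "val": [4], "test": [5]}

# Write the dictionary to a temporary pickle file (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "example_split_idx.pkl")
with open(path, "wb") as f:
    pickle.dump(split_idx, f)

# Read it back and look up each subset by its key.
with open(path, "rb") as f:
    loaded = pickle.load(f)

for name in ("train", "val", "test"):
    print(name, loaded[name])
```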

Yes, `x` stores the node features. As our graphs are featureless, we do not have them in the raw data. However, the GNN algorithms need some values to operate on,...

Cool. I'll add some details.

@velpc These are dataset statistics stored in the HDF5 file (and may not be used by the model). For different specific problems such as multi-class classification, you can write your own...

@helmoai Sorry, we currently don't have an official mini dataset for quick testing. Could you download the data and take out a subset (e.g. a few graphs) to run...

@helmoai Yes, you are right. Thanks for pointing it out! I've updated it.