MLDatasets.jl

added the PPI dataset

Open. scratch-er opened this issue 1 year ago • 1 comment

In this pull request, we add the PPI dataset from http://snap.stanford.edu/graphsage/#datasets

The PPI dataset is loaded in the same way as Reddit. We added a new file, graphsage.jl, which provides the function load_graphsage_data. Most of the code in the original reddit.jl is copied into this function, and we added labels = stack(labels) to convert the labels of the PPI dataset from a Vector{JSON3.Array} to a matrix.
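For illustration, here is a minimal sketch of what the stack call does. It uses plain Julia vectors as a stand-in for the JSON3.Array values, and the sample labels are made up, so it only approximates the actual loading code. It assumes Julia 1.9 or later, where stack is available in Base.

```julia
# Stand-in for the parsed labels: one label vector per node.
# (Assumption: in the PR these are JSON3.Array values; plain vectors are used here.)
labels = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]

# stack places each inner vector along a new trailing dimension,
# so each node's labels become one column of the resulting matrix.
label_matrix = stack(labels)

@show size(label_matrix)   # (number of label entries, number of nodes)
@show label_matrix[:, 2]   # the labels of the second node
```

With this layout, the labels of node i are the column label_matrix[:, i].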

We also added ppi.jl, which defines the PPI dataset and loads its data with load_graphsage_data. reddit.jl now uses the same function to load its data.

scratch-er, Apr 29 '24 13:04

Hi, thanks for the contribution, and sorry for getting back to you so late. This is a welcome addition. You just need to add the new dataset to the documentation in docs/src/datasets/graph.md.

CarloLucibello, Jun 27 '24 07:06