Reproduce Table 4 in paper
Hi --
Can you point me to the code needed to reproduce the results in Table 4 of the paper? I ran
graphvite baseline deepwalk_youtube
which produced
----------- node classification ------------
effective labels: 50691 / 50767
macro-F1@1%: 0.310211
macro-F1@2%: 0.342265
macro-F1@3%: 0.352457
macro-F1@4%: 0.362396
macro-F1@5%: 0.367422
macro-F1@6%: 0.374697
macro-F1@7%: 0.376081
macro-F1@8%: 0.381525
macro-F1@9%: 0.381062
macro-F1@10%: 0.384292
micro-F1@1%: 0.379791
micro-F1@2%: 0.410207
micro-F1@3%: 0.422721
micro-F1@4%: 0.433862
micro-F1@5%: 0.441079
micro-F1@6%: 0.448772
micro-F1@7%: 0.451162
micro-F1@8%: 0.457318
micro-F1@9%: 0.45942
micro-F1@10%: 0.462941
Those results are in the same ballpark, but they differ from Table 4 by a few percentage points. Is the command above correct? Or is this kind of variation expected?
Thanks!
Hi. It's expected.
The original paper evaluates with liblinear, while here the classifier is evaluated with PyTorch. liblinear is optimized by second-order methods and its stopping criterion differs from our PyTorch implementation.
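A toy illustration of why the numbers can drift: two logistic-regression solvers with different optimization methods and stopping criteria land at slightly different weights, hence slightly different F1. This is a hypothetical sketch using scikit-learn's `liblinear` solver next to a first-order solver, not the actual evaluation code from either paper:

```python
# Sketch: same data, two solvers with different optimizers / stop criteria.
# This is illustrative only, not GraphVite's or DeepWalk's evaluation code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                              # stand-in embeddings
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)  # stand-in labels

X_train, y_train = X[:100], y[:100]
X_test, y_test = X[100:], y[100:]

# liblinear: second-order (trust-region Newton style) with a tolerance stop
clf_a = LogisticRegression(solver="liblinear", tol=1e-4).fit(X_train, y_train)
# a first-order solver with a looser stopping criterion
clf_b = LogisticRegression(solver="saga", tol=1e-2, max_iter=200).fit(X_train, y_train)

print(f1_score(y_test, clf_a.predict(X_test)))
print(f1_score(y_test, clf_b.predict(X_test)))
```

Both classifiers are "correct", but the scores typically differ in the low decimal places, which is the kind of gap you're seeing.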
OK great thanks. Are you able to give some more details about the experimental setup for those numbers? I have the following files for the youtube dataset:
~/.graphvite/dataset/youtube/
├── youtube_graph.txt
├── youtube-groupmemberships.txt
├── youtube-groupmemberships.txt.gz
├── youtube_label.txt
└── youtube-links.txt.gz
I'm guessing that Table 4 shows the results for predicting youtube_label.txt -- but that file only has 50767 entries for 31761 unique nodes, instead of |V| = 1,138,499 entries like I'd expect. Thoughts?
Thanks! ~ Ben
If I had to guess w/o digging through your code (yet :) )--
I'm guessing that you convert youtube_label.txt to a (num_labeled_examples, num_labels) binary matrix, then use k% of the rows of the matrix to train the classifier and the remaining (100 - k)% to validate it. In that case, "1% Labeled Nodes" would mean you used 31761 * 0.01 ≈ 317 nodes to train the classifier in the first column of Table 4.
Is that right? Or am I misunderstanding something?
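For reference, a minimal sketch of the split I'm describing, on toy data (all names here are mine, not GraphVite's):

```python
# Build a (num_labeled_nodes, num_labels) binary matrix from (node, label)
# pairs, then split k% of the ROWS for training. Toy data, illustrative only.
import numpy as np

pairs = [(0, 1), (0, 3), (1, 2), (2, 1), (3, 0), (4, 2)]  # (node, label)
nodes = sorted({n for n, _ in pairs})
labels = sorted({l for _, l in pairs})
node_idx = {n: i for i, n in enumerate(nodes)}
label_idx = {l: j for j, l in enumerate(labels)}

Y = np.zeros((len(nodes), len(labels)), dtype=np.int8)
for n, l in pairs:
    Y[node_idx[n], label_idx[l]] = 1      # multi-label: a node can have several 1s

k = 40  # percent of labeled nodes used for training (1..10 in Table 4)
rng = np.random.default_rng(0)
perm = rng.permutation(len(nodes))
n_train = max(1, int(len(nodes) * k / 100))
train_rows, test_rows = perm[:n_train], perm[n_train:]
print(Y[train_rows].shape, Y[test_rows].shape)  # → (2, 4) (3, 4)
```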
Yes you're totally right. This setting is exactly inherited from DeepWalk and LINE.
youtube-groupmemberships.txt is the raw label file, which contains a huge number of communities. Since most communities are really small and noisy, only a few large communities are used. For Youtube, it's the top-47 communities, following the DeepWalk paper. You can find the generation code here.
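In case it helps, the filtering step could look roughly like this. This is a sketch under my own assumptions about the preprocessing (the in-memory `memberships` list stands in for parsing youtube-groupmemberships.txt); the real generation code linked above is authoritative:

```python
# Keep only memberships in the top-k largest communities.
# Toy data; for the YouTube dataset top_k would be 47.
from collections import Counter

memberships = [(1, "a"), (2, "a"), (3, "a"), (1, "b"), (4, "c"), (5, "a"), (2, "b")]
top_k = 2

sizes = Counter(group for _, group in memberships)       # community -> size
kept = {g for g, _ in sizes.most_common(top_k)}          # largest communities
filtered = [(node, g) for node, g in memberships if g in kept]
print(filtered)
```

Nodes whose only communities are filtered out end up unlabeled, which would explain why youtube_label.txt covers 31761 nodes rather than all of |V|.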
Personally I guess the original authors used such a small evaluation set simply because they were running liblinear on CPU.