I have a problem with the BlogCatalog dataset
I ran Runme.py, which trains the embedding on the BlogCatalog dataset. But when I used the embedding for node classification, the performance was terrible: the micro-F1 was around 0.2. Why?
Thanks for your interest. Did you make the indices in your evaluation consistent with the ones used in the embedding learning?
Thanks.
My label file follows the node ids, from 0 to ... What order do the rows of Embedding.mat follow in your source code?
CombG = G[Group1+Group2, :][:, Group1+Group2]
The row order in Embedding.mat follows Group1+Group2.
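For example, to align a label file that is ordered by node id with the rows of Embedding.mat, you could do something like this (just an illustration; Group1, Group2, and labels_by_id are placeholder names, not the exact variables in Runme.py):

import numpy as np

# Hypothetical stand-ins: Group1/Group2 are the index lists used to build
# CombG = G[Group1+Group2, :][:, Group1+Group2].
Group1 = [3, 0, 4]
Group2 = [1, 2]
labels_by_id = np.array([10, 11, 12, 13, 14])   # label of node i stored at position i

# Embedding.mat rows follow the concatenated order Group1 + Group2,
# so reorder the labels the same way before evaluating.
order = np.array(Group1 + Group2)
labels_aligned = labels_by_id[order]            # labels_aligned[k] matches embedding row k
print(labels_aligned)                           # [13 10 14 11 12]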
It is for evaluation. Sorry for the confusion; I directly released the code from my own evaluation. I will update it when I get time.
Thanks.
Thank you very much. I have another question. In your source code, is the whole network used for training, rather than "removing the edges between train data and test data" as mentioned in your paper?
Yes. CombG = G[Group1+Group2, :][:, Group1+Group2]
We use the whole network to train Embedding.mat. After getting Embedding.mat, you can do cross-validation on it.
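Something along these lines should reproduce that flow (a minimal sketch, not my exact evaluation script; the variable name 'Embedding' inside the .mat file is a guess, and labels_aligned is the reordered label vector from the snippet above):

import numpy as np
from scipy.io import loadmat
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import f1_score

H = loadmat('Embedding.mat')['Embedding']           # n x d embedding, rows in Group1+Group2 order
y = labels_aligned                                  # placeholder: labels aligned to the rows of H

pred = cross_val_predict(LinearSVC(), H, y, cv=5)   # cross-validation on the learned embedding
print('micro-F1:', f1_score(y, pred, average='micro'))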
Thanks.
I made the indices in my evaluation consistent with the ones in the embedding learning, but the performance on the Flickr dataset was lower than in the paper. I just used the default parameters in your implementation. @xhuang31
How about BlogCatalog? I use the SVM in MATLAB to perform the classification in my papers.
As long as you use the same classifier, you will get comparable results for AANE and the baselines. They may all become worse together, but relatively, AANE should still outperform the baselines in general.
Thanks.
I also used a linear SVM. I used 30% of the BlogCatalog nodes to train the classifier, and the micro-F1 is around 0.82. @xhuang31 Thanks for your attention.
It is five-fold cross-validation, so 80% of the data should be used for training. Please check the paper. Thanks.
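To illustrate the protocol (a sketch under the stated setup, not the exact evaluation script): in five-fold cross-validation each fold trains the linear SVM on 80% of the nodes and tests on the remaining 20%, and the average micro-F1 is reported.

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

def five_fold_micro_f1(H, y, seed=0):
    # H: n x d embedding matrix; y: length-n label vector aligned to the rows of H.
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(H, y):     # 80% train / 20% test per fold
        clf = LinearSVC().fit(H[train_idx], y[train_idx])
        scores.append(f1_score(y[test_idx], clf.predict(H[test_idx]), average='micro'))
    return np.mean(scores)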