report the results on all datasets
Results of node2vec, deepwalk, line, sdne and struc2vec on all datasets. I hope this will help anyone who is interested in this project.
**wiki**
| Algorithm | Micro-F1 | Macro-F1 | Samples-F1 | Weighted-F1 | Accuracy | NMI |
|---|---|---|---|---|---|---|
| node2vec | 0.7447 | 0.6771 | 0.7193 | 0.7450 | 0.6279 | 0.3536 |
| deepwalk | 0.7307 | 0.6579 | 0.7058 | 0.7296 | 0.6091 | 0.3416 |
| line | 0.5059 | 0.2461 | 0.4536 | 0.4523 | 0.3160 | 0.0798 |
| sdne | 0.6916 | 0.5119 | 0.6528 | 0.6718 | 0.5530 | 0.1801 |
| struc2vec | 0.4512 | 0.1249 | 0.3933 | 0.3383 | 0.2308 | 0.0516 |
**brazil**
| Algorithm | Micro-F1 | Macro-F1 | Samples-F1 | Weighted-F1 | Accuracy | NMI |
|---|---|---|---|---|---|---|
| node2vec | 0.1481 | 0.1579 | 0.1481 | 0.1648 | 0.1481 | 0.0442 |
| deepwalk | 0.1852 | 0.1694 | 0.1852 | 0.2004 | 0.1852 | 0.0471 |
| line | 0.4444 | 0.4167 | 0.4444 | 0.4753 | 0.4444 | 0.2822 |
| sdne | 0.5926 | 0.5814 | 0.5926 | 0.5928 | 0.5926 | 0.4041 |
| struc2vec | 0.7778 | 0.7739 | 0.7778 | 0.7762 | 0.7778 | 0.3906 |
**europe**
| Algorithm | Micro-F1 | Macro-F1 | Samples-F1 | Weighted-F1 | Accuracy | NMI |
|---|---|---|---|---|---|---|
| node2vec | 0.4125 | 0.4156 | 0.4125 | 0.4209 | 0.4125 | 0.0155 |
| deepwalk | 0.4375 | 0.4358 | 0.4375 | 0.4347 | 0.4375 | 0.0180 |
| line | 0.5000 | 0.4983 | 0.5000 | 0.5016 | 0.5000 | 0.1186 |
| sdne | 0.5000 | 0.4818 | 0.5000 | 0.4916 | 0.5000 | 0.1714 |
| struc2vec | 0.5375 | 0.5247 | 0.5375 | 0.5294 | 0.5375 | 0.0783 |
**usa**
| Algorithm | Micro-F1 | Macro-F1 | Samples-F1 | Weighted-F1 | Accuracy | NMI |
|---|---|---|---|---|---|---|
| node2vec | 0.5420 | 0.5278 | 0.5420 | 0.5351 | 0.5420 | 0.0822 |
| deepwalk | 0.5504 | 0.5394 | 0.5504 | 0.5472 | 0.5504 | 0.0910 |
| line | 0.4160 | 0.4032 | 0.4160 | 0.4175 | 0.4160 | 0.1660 |
| sdne | 0.6092 | 0.5819 | 0.6092 | 0.5971 | 0.6092 | 0.2028 |
| struc2vec | 0.5210 | 0.5040 | 0.5210 | 0.5211 | 0.5210 | 0.0702 |
For the wiki dataset given by the author, it is single-label, so what I got is micro = samples = acc. Or do you have more complete data for wiki?
Here is the documentation of the `average` parameter of `sklearn.metrics.f1_score`:
`average` : string, [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']. This parameter is required for multiclass/multilabel targets.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
So I think it will give different results in the multiclass case.
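A minimal sketch (toy labels, not taken from these datasets) of how the `average` settings behave on a multiclass problem. It shows that micro-F1 equals accuracy for single-label input, while macro and weighted F1 generally differ; note that `average="samples"` is only defined for multilabel (label-indicator) targets:

```python
# Toy multiclass example (hypothetical labels, for illustration only).
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]

for average in ["micro", "macro", "weighted"]:
    print(average, f1_score(y_true, y_pred, average=average))
print("acc", accuracy_score(y_true, y_pred))

# For single-label multiclass input, micro-F1 equals accuracy (0.75 here),
# while macro-F1 and weighted-F1 differ. average="samples" is not used above
# because samplewise metrics are only defined for multilabel input.
```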
@dawnranger That's good. I think you can open a pull request with the results on these datasets and the code to reproduce them, in a new folder.
@dawnranger Regarding "'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score)": wiki is multiclass rather than multilabel, isn't it? Why is there a difference between samples and acc? In addition, for the flight data in your results, micro = samples = acc.
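For comparison, a hedged multilabel sketch (toy label-indicator matrices, not the wiki data) illustrating the distinction the quoted documentation makes: samples-F1 averages a per-instance F1, while `accuracy_score` on label-indicator input is exact-match (subset) accuracy, so the two can diverge once labels are binarized:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy label-indicator matrices (3 samples, 3 labels) - illustrative only.
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

print("samples-F1:", f1_score(Y_true, Y_pred, average="samples"))  # ~0.78
print("subset acc:", accuracy_score(Y_true, Y_pred))               # ~0.33
```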
I think you are right. I used shenweichen's code:
```python
# Y: ground-truth labels, Y_: predicted labels
from sklearn.metrics import accuracy_score, f1_score

averages = ["micro", "macro", "samples", "weighted"]
results = {}
for average in averages:
    results[average] = f1_score(Y, Y_, average=average)
results['acc'] = accuracy_score(Y, Y_)
```
and I got the following warning with the wiki dataset:
```
python3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
```
As discussed on Stack Overflow, a bad train/test split might be to blame for this issue.
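If that is the cause, a stratified split is one way to reduce the chance that a rare class ends up with no predicted samples. A short sketch on synthetic stand-in arrays (not the actual wiki embeddings):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                                     # stand-in for node embeddings
y = rng.choice([0, 1, 2, 3], size=200, p=[0.55, 0.3, 0.1, 0.05])   # imbalanced labels

# stratify=y keeps the class proportions in both splits, so rare classes
# are less likely to be missing from the training or test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.5, stratify=y, random_state=0
)
```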
@dawnranger Yes. I found that classify.py is similar to scoring.py in deepwalk, which is provided by the author: https://github.com/phanein/deepwalk/blob/master/example_graphs/scoring.py. What confused me is that the author did not provide the results or the origin of the wiki dataset. In addition, I tried the BlogCatalog dataset (multi-label) as mentioned in the node2vec paper, and set the parameters as the paper did (d=128, r=10, l=80, k=10, training percent=50%, p=q=0.25), but I only got 0.12 Macro-F1, far from the result the author reported (0.2581). So depressing...
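For reference, here is a hedged sketch of the evaluation loop described above (a one-vs-rest logistic regression on the learned embeddings with a 50% training split, scored with Macro-F1), written directly against scikit-learn rather than the repo's classify.py; the arrays are random stand-ins, not real embeddings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(300, 128))   # stand-in for d=128 node2vec vectors
labels = rng.integers(0, 5, size=300)      # stand-in for node labels

# 50% training split, stratified by label, as in the experiments above.
X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, train_size=0.5, stratify=labels, random_state=0
)

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Macro-F1:", f1_score(y_test, y_pred, average="macro"))
```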
Hello, from these results the accuracy does not seem to be very high. What is the cause? Is it a data problem?