
Category: A1; Team name: GAAIMC; Dataset: FacebookPagePage

Open ixime opened this issue 6 months ago • 0 comments

Checklist

  • [x] My pull request has a clear and explanatory title.
  • [x] My pull request passes the Linting test.
  • [x] I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
  • [x] My PR follows PEP8 guidelines. (refer to comment below)
  • [x] My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • [x] I linked to issues and PRs that are relevant to this PR.

Description

This pull request adds the FacebookPagePage dataset, published in [1], for the TAG-DS Topological Deep Learning Challenge 2025: Expanding the Data Landscape.

This webgraph is a page-page graph of verified Facebook sites. Nodes represent official Facebook pages, and links are mutual likes between sites. Node features are extracted from the site descriptions that page owners wrote to summarize the purpose of the site. The graph was collected through the Facebook Graph API in November 2017 and is restricted to pages from 4 categories defined by Facebook: politicians, governmental organizations, television shows and companies. The task for this dataset is multi-class node classification over the 4 site categories [2].
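
As a rough illustration of the structure described above, the sketch below builds a symmetric edge list and integer class labels from a toy graph. The page ids, the label strings, and the helper `build_undirected_edges` are all hypothetical and are not taken from the dataset files or from this PR; dropping self-loops is likewise only a simplification made for the sketch.

```python
# Toy stand-in for the page-page graph; every id and label below is made up.
CATEGORIES = ["politician", "government", "tvshow", "company"]  # assumed label strings
LABEL_TO_ID = {name: i for i, name in enumerate(CATEGORIES)}

page_labels = {0: "politician", 1: "government", 2: "tvshow", 3: "company", 4: "company"}
mutual_likes = [(0, 1), (1, 2), (3, 4), (2, 1)]  # (2, 1) repeats (1, 2)


def build_undirected_edges(pairs):
    """Deduplicate pairs and emit both directions, since mutual likes are symmetric."""
    # Sorting each pair makes (2, 1) and (1, 2) collapse to the same key.
    # Self-loops are dropped here purely as a simplification of this sketch.
    unique = {tuple(sorted(p)) for p in pairs if p[0] != p[1]}
    edges = []
    for u, v in sorted(unique):
        edges.append((u, v))
        edges.append((v, u))
    return edges


edges = build_undirected_edges(mutual_likes)
# One integer label per node, ordered by node id, for 4-class node classification.
y = [LABEL_TO_ID[page_labels[n]] for n in sorted(page_labels)]
```

Each of the three distinct mutual-like pairs appears once per direction in `edges`, and `y` holds one class index per page.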

This dataset is available in PyG [3], but its download URL is broken, so we downloaded the data from [2]. In [4], the features were reduced to a dimensionality of 128 using SVD. We added this dimensionality reduction as a data transformation that is applied by default for this dataset; the complete feature set is kept in case another kind of data transformation is preferred.
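
A minimal sketch of an SVD-based feature reduction of this kind, assuming a dense node-feature matrix and NumPy. The function name `svd_reduce` and the random input are hypothetical; this is not the transform implemented in this PR. PyG also provides a comparable transform, `torch_geometric.transforms.SVDFeatureReduction`, though whether the PR relies on it is not stated here.

```python
import numpy as np


def svd_reduce(x: np.ndarray, out_dim: int = 128) -> np.ndarray:
    """Project node features onto their top `out_dim` singular directions."""
    if x.shape[1] <= out_dim:
        # Nothing to reduce: the features already fit in `out_dim` columns.
        return x
    # Thin SVD: x = U @ diag(S) @ Vt, with singular values sorted in descending order.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    # Keep the leading components, scaled by their singular values.
    return u[:, :out_dim] * s[:out_dim]


# Random stand-in for the real feature matrix (200 nodes, 300 raw features).
rng = np.random.default_rng(0)
features = rng.random((200, 300))
reduced = svd_reduce(features, out_dim=128)
```

Because the function returns a new array, the original `features` matrix is left untouched, which mirrors the idea of keeping the complete data available for other transformations.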

The same data transformation is used in PRs #216, #217 and #229.

References:

[1] B. Rozemberczki, C. Allen and R. Sarkar. Multi-Scale Attributed Node Embedding. IMA Journal of Complex Networks, 2021.

[2] SNAP: Network datasets: Facebook Page Page

[3] Facebook_page_page in PyG

[4] B. Rozemberczki and R. Sarkar. Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models. 2020.

Issue

Additional context

ixime, Nov 05 '25 15:11