TopoBench

Category: A1; Team name: LangDiff; Dataset: Twitch

Open Mullerio opened this issue 2 months ago • 0 comments

Checklist

  • [x] My pull request has a clear and explanatory title.
  • [x] My pull request passes the Linting test.
  • [x] I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
  • [x] My PR follows PEP8 guidelines. (refer to comment below)
  • [x] My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • [x] I linked to issues and PRs that are relevant to this PR.

Description

This pull request implements the Twitch dataset [1].

The Twitch dataset consists of multiple social network graphs, one for each language spoken by streamers on the streaming platform Twitch. Each node is a streamer, and edges correspond to followership between streamers. Node features are embeddings that represent the games played. The task is binary node classification: predict whether a streamer streams mature content, based on the games played.
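To make the structure above concrete, here is a minimal toy sketch of a single language graph. The class `ToyTwitchGraph`, its field names, and all values are hypothetical and for illustration only. They are not the loader added in this PR or the PyTorch Geometric API, and treating edges as undirected is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyTwitchGraph:
    """Hypothetical container for one language graph (illustration only)."""

    language: str                   # language code of this graph (toy value)
    edges: list[tuple[int, int]]    # followership pairs between streamer ids
    features: list[list[float]]     # one embedding per streamer (games played)
    labels: list[int]               # 1 = streams mature content, 0 = does not

    def num_nodes(self) -> int:
        # One feature row per streamer, so the row count is the node count.
        return len(self.features)

    def neighbors(self, node: int) -> list[int]:
        # Assumption: edges are treated as undirected in this sketch.
        found = []
        for u, v in self.edges:
            if u == node:
                found.append(v)
            elif v == node:
                found.append(u)
        return sorted(found)


# Three made-up streamers with made-up 2-dimensional embeddings.
toy = ToyTwitchGraph(
    language="DE",
    edges=[(0, 1), (1, 2)],
    features=[[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]],
    labels=[1, 0, 0],
)
```

Each streamer has exactly one feature vector and one binary label, so a classifier for this task maps node features (optionally combined with neighborhood information) to a label in {0, 1}.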

[1] Benedek Rozemberczki, Carl Allen, & Rik Sarkar. (2021). Multi-scale Attributed Node Embedding.

Relevant PRs from PyTorch Geometric

The dataset is available in PyTorch Geometric but is currently broken (see pyg-team/pytorch_geometric#10510), so it is implemented in full here.

There is also a related PR, pyg-team/pytorch_geometric#10415, which I think does not fully fix the issue.

Additional context

Submitted by Jonas Müller of Team LangDiff.

Mullerio, Nov 07 '25 18:11