Category: A1; Team name: TG; Datasets: [CDC-climate, US-county-fb]

Open grapentt opened this issue 2 months ago • 2 comments

Checklist

  • [x] My pull request has a clear and explanatory title.
  • [x] My pull request passes the Linting test.
  • [x] I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
  • [ ] My PR follows PEP8 guidelines. (refer to comment below)
  • [x] My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • [ ] Configure graph loading in yaml file?

Description

Submission for TAG-DS TDL Challenge 2025

This PR adds two new graph-based regression datasets for node-level prediction:

  • CDC-Climate: U.S. county-level climate data (3,107 nodes, 6 features)
  • US-county-fb: U.S. county network with Facebook Social Connectedness Index (3,105 nodes, 9 features including social homophily measures)

Since both datasets are graphs (just like US-county-demos), I implemented a load_graph_with_features() method that accepts a GraphDatasetConfig specifying file paths, column names, separators, and optional preprocessing functions. I also refactored the USCountyDemosDataset.process() method to use this new unified approach. For backward compatibility, the original read_us_county_demos() function remains available and now internally calls load_graph_with_features() with the appropriate configuration.
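To illustrate the config-driven approach described above, here is a minimal sketch, not the code from this PR. Only the names `GraphDatasetConfig` and `load_graph_with_features` come from the description; every field name, the function signature, and the return format are assumptions made for illustration. It uses plain Python lists and the standard library instead of the tensors a real TopoBench loader would presumably produce.

```python
import csv
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GraphDatasetConfig:
    # All field names below are hypothetical; the PR's actual config may differ.
    edges_path: str
    features_path: str
    node_id_col: str
    feature_cols: list
    target_col: str
    src_col: str = "src"
    dst_col: str = "dst"
    sep: str = ","
    preprocess: Optional[Callable[[dict], dict]] = None  # optional per-row hook


def load_graph_with_features(cfg):
    """Return (edge_index, x, y) as nested Python lists (assumed format)."""
    with open(cfg.features_path, newline="") as f:
        rows = list(csv.DictReader(f, delimiter=cfg.sep))
    if cfg.preprocess is not None:
        rows = [cfg.preprocess(row) for row in rows]

    # Map raw node ids (e.g. county codes) to contiguous indices 0..n-1.
    index = {row[cfg.node_id_col]: i for i, row in enumerate(rows)}
    x = [[float(row[col]) for col in cfg.feature_cols] for row in rows]
    y = [float(row[cfg.target_col]) for row in rows]

    sources, targets = [], []
    with open(cfg.edges_path, newline="") as f:
        for edge in csv.DictReader(f, delimiter=cfg.sep):
            s, t = edge[cfg.src_col], edge[cfg.dst_col]
            # Skip edges that touch a node with no feature row.
            if s in index and t in index:
                sources.append(index[s])
                targets.append(index[t])
    return [sources, targets], x, y
```

Under this design, a dataset-specific reader such as `read_us_county_demos()` would only need to build the matching config and delegate to the shared loader, which is how the backward-compatible wrapper described above could stay thin.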

grapentt, Nov 08 '25 21:11

Dear Challenge Participant,

As we approach the final deadline, we kindly ask you to verify that all tests are passing on your submission. This will ensure that your contribution is valid and can be reviewed.

levtelyatnikov, Nov 17 '25 09:11

Hi @grapentt! I just marked your PR as ready for review, since the submission seems to be complete. However, feel free to modify it further if you want to!

Thank you!

gbg141, Nov 26 '25 02:11