Category: A1; Team name: TG; Datasets: [CDC-climate, US-county-fb]
Checklist
- [x] My pull request has a clear and explanatory title.
- [x] My pull request passes the Linting test.
- [x] I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
- [ ] My PR follows PEP8 guidelines. (refer to comment below)
- [x] My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
- [ ] Configure graph loading in the YAML file?
Description
Submission for TAG-DS TDL Challenge 2025
This PR adds two new graph-based regression datasets for node-level prediction:
- CDC-Climate: U.S. county-level climate data (3,107 nodes, 6 features)
- US-county-fb: U.S. county network with Facebook Social Connectedness Index (3,105 nodes, 9 features including social homophily measures)
Since both datasets are graphs (just like US-county-demos), I implemented a load_graph_with_features() method that accepts a GraphDatasetConfig specifying file paths, column names, separators, and optional preprocessing functions. I also refactored the USCountyDemosDataset.process() method to use this new unified approach. For backward compatibility, the original read_us_county_demos() function remains available and now internally calls load_graph_with_features() with the appropriate configuration.
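To illustrate the config-driven loading described above, here is a minimal, standard-library-only sketch. It is not the code in this PR: the field names on `GraphDatasetConfig` (`edges_path`, `features_path`, `node_id_col`, `feature_cols`, `target_col`, `src_col`, `dst_col`, `sep`, `preprocess`) and the plain-list return value are assumptions made for illustration. The actual implementation may use different names and return a different type.

```python
import csv
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class GraphDatasetConfig:
    """Hypothetical config; all field names are illustrative assumptions."""

    edges_path: str
    features_path: str
    node_id_col: str
    feature_cols: List[str]
    target_col: str
    src_col: str = "src"
    dst_col: str = "dst"
    sep: str = ","
    # Optional per-row hook applied to each feature row before parsing.
    preprocess: Optional[Callable[[dict], dict]] = None


def load_graph_with_features(config):
    """Return (edge_index, x, y) with nodes re-indexed to 0..n-1."""
    with open(config.features_path, newline="") as f:
        rows = list(csv.DictReader(f, delimiter=config.sep))
    if config.preprocess is not None:
        rows = [config.preprocess(row) for row in rows]

    # Map raw node identifiers (e.g. county codes) to contiguous indices.
    index = {row[config.node_id_col]: i for i, row in enumerate(rows)}
    x = [[float(row[c]) for c in config.feature_cols] for row in rows]
    y = [float(row[config.target_col]) for row in rows]

    sources, targets = [], []
    with open(config.edges_path, newline="") as f:
        for edge in csv.DictReader(f, delimiter=config.sep):
            src, dst = edge[config.src_col], edge[config.dst_col]
            # Skip edges that touch a node without a feature row.
            if src in index and dst in index:
                sources.append(index[src])
                targets.append(index[dst])
    return [sources, targets], x, y
```

Under this design, each dataset differs only in its config object, so a dataset-specific reader such as read_us_county_demos() can reduce to building the right config and delegating to the shared loader.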
Dear Challenge Participant,
As we approach the final deadline, we kindly ask you to verify that all tests are passing on your submission. This will ensure that your contribution is valid and can be reviewed.
Hi @grapentt! I just marked your PR as ready for review, since the submission seems to be complete. However, feel free to modify it further if you want to!
Thank you!