Category: A1; Team name: SPAICOM_CattDiN; Dataset: LargeScaleMultipurposeBenchmarkDatasetsWDN
Checklist
- [X] My pull request has a clear and explanatory title.
- [X] My pull request passes the Linting test.
- [X] I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
- [X] My PR follows PEP8 guidelines. (refer to comment below)
- [X] My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
- [X] I linked to issues and PRs that are relevant to this PR.
Description
This pull request integrates eleven datasets for Water Distribution Network (WDN) analysis as described in [1]: they are generated synthetically via numerical simulation from well-known network configurations.
The eleven datasets are Anytown, Balerman, C-Town, D-Town, EXN, KY1, KY6, KY8, KY13, L-Town, and Modena.
Each of these datasets comprises many different .csv files; however, we restricted our attention to the following files, which contain time series generated as described in the reference paper, plus one metadata file:
| File | Domain |
|---|---|
| pressure.csv | Nodes |
| demand.csv | Nodes |
| flowrate.csv | Edges |
| velocity.csv | Edges |
| head.csv | Nodes |
| head_loss.csv | Edges |
| friction_factor.csv | Edges |
| attrs.json | - |
Each of these files includes a number of scenarios, and each scenario consists of a sequence of subsequent snapshots in time. The number of scenarios is stored under the key gen_batch_size in the attrs.json file, while the number of timestamps is stored under the key duration. The metadata file attrs.json also contains the graph, as an adjacency list, under the key adj_list.
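As a minimal sketch of how these metadata could be read, the snippet below extracts the three keys mentioned above from a parsed attrs.json. The helper names `load_attrs` and `summarize_attrs` are hypothetical and not part of this PR, the toy dictionary is made up for illustration, and the layout of each adj_list entry (source node, target node, optional extra fields) is an assumption that should be checked against the actual files.

```python
import json


def load_attrs(path):
    """Load a dataset's attrs.json file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_attrs(attrs):
    """Extract scenario count, number of timestamps, nodes, and edges.

    Assumption: each adj_list entry is a sequence whose first two items
    are the source and target node identifiers; extra items are ignored.
    """
    n_scenarios = int(attrs["gen_batch_size"])
    n_steps = int(attrs["duration"])
    edges = [(entry[0], entry[1]) for entry in attrs["adj_list"]]
    nodes = sorted({node for edge in edges for node in edge})
    return n_scenarios, n_steps, nodes, edges


# Toy metadata for illustration only (not taken from the datasets).
toy_attrs = {
    "gen_batch_size": 3,
    "duration": 24,
    "adj_list": [["J1", "J2", "P1"], ["J2", "J3", "P2"]],
}
n_scenarios, n_steps, nodes, edges = summarize_attrs(toy_attrs)
```

With a real dataset, `summarize_attrs(load_attrs("attrs.json"))` would be the intended call, provided the adjacency entries follow the assumed layout.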
Water Distribution Networks (WDNs) can be naturally represented as graphs, which has led to extensive use of graph deep learning for a variety of challenging tasks [2]. These applications span both transductive settings, such as estimating the full network state from partial observations at a single time snapshot, and inductive settings, where models leverage spatiotemporal structure to perform tasks like demand forecasting. The datasets therefore pose many possible regression problems that can be cast at the node level, at the edge level, or at a combined level.
However, the physical parameters of WDNs are governed by a variety of structural laws that impose higher-order topological constraints on the system [3]. Thus, we believe that topological deep learning could offer a powerful and principled framework for addressing the engineering problems posed by WDN monitoring and analysis.
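To make the idea of a structural law concrete, here is a hedged sketch of the simplest one, flow conservation at nodes, written with a node-edge incidence matrix on a made-up four-node network. The function names, the toy topology, the flow values, and the sign convention (an edge counts as -1 at its source and +1 at its target) are all assumptions for illustration; they are not taken from the datasets or from [3], and loop-level laws would additionally require higher-order cells.

```python
def incidence_matrix(nodes, edges):
    """Build the node-edge incidence matrix as a list of rows.

    Entry [i][j] is -1 if edge j leaves node i, +1 if edge j enters
    node i, and 0 otherwise (assumed orientation convention).
    """
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(edges) for _ in nodes]
    for j, (source, target) in enumerate(edges):
        matrix[index[source]][j] = -1
        matrix[index[target]][j] = 1
    return matrix


def net_inflow(matrix, flows):
    """Return, for every node, total inflow minus total outflow."""
    return [sum(b * f for b, f in zip(row, flows)) for row in matrix]


# Toy network: reservoir R feeds junction A, which feeds B and C;
# B also feeds C. Flows are given along the edge orientations.
nodes = ["R", "A", "B", "C"]
edges = [("R", "A"), ("A", "B"), ("A", "C"), ("B", "C")]
flows = [10.0, 6.0, 4.0, 1.0]

balance = net_inflow(incidence_matrix(nodes, edges), flows)
# Under mass conservation, each junction's net inflow equals its demand,
# and the reservoir's negative value is the total supply.
```

In this toy case the net inflow is zero at the pure junction A, equals the consumption at B and C, and is negative at the reservoir, which is the kind of constraint a physics-informed model could enforce between node signals and edge signals.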
Issue
We know that spatio-temporal and cross-domain learning is beyond the scope of the architectures currently implemented in TopoBench. In our opinion, this does not make the contribution any less relevant; rather, it makes it crucial in view of the implementation of new topological models for time series and for physics-informed topological learning. Moreover, it serves as a first bridge between the topological deep learning and water distribution network (WDN) communities, establishing TopoBench as a practical tool for this type of analysis.
References
[1] Tello, A., et al., "Large-Scale Multipurpose Benchmark Datasets For Assessing Data-Driven Deep Learning Approaches For Water Distribution Networks" (2023)
[2] Vittori, G., et al., "Graph neural networks to model and optimize the operation of Water Distribution Networks: A review" (2025)
[3] Cattai, T., et al., "Physics-Informed Topological Signal Processing for Water Distribution Network Monitoring" (2025)
Co-authored by @TizianaCattai and @LeoDiNino97
Dear Participants,
This is a final reminder regarding the upcoming challenge deadline.
📅 Deadline: Tomorrow, 25th November 2025
✅ Critical Requirement: Please ensure your branch is passing all CI/CD tests.
If you have any pending changes, please push them and verify your build status as soon as possible.
Good luck!