graphnet
Collect open dataset(s) for model development and benchmarking
Description
Open datasets would enable benchmarking and foster reproducibility. Options include existing open-source datasets, e.g. PROMETHEUS and Kaggle datasets. One might also include datasets from other physics experiments that are similar in form, such as jet-tagging datasets.
Acceptance Criteria
- [ ] Identify candidate datasets
- [ ] Convert dataset to supported file format(s)
- [ ] Provide open dataset with clear documentation or reference to existing documentation
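For the conversion step, a minimal sketch of what this could look like, assuming the candidate dataset ships as CSV and that SQLite is one of the target formats (this issue does not name the supported formats). The function `csv_to_sqlite`, the table name `pulses`, and the column names in the toy input are hypothetical and not part of GraphNeT's API; only the Python standard library is used.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_file, connection, table="pulses"):
    """Copy rows from a CSV file object into a SQLite table; return the row count.

    Hypothetical helper for illustration only, not a GraphNeT function.
    """
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames
    if not columns:
        raise ValueError("CSV input has no header row")
    # Column names come from the CSV header; quote them to keep the SQL valid.
    quoted = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Columns are created without declared types, so values are stored as text.
    # A real conversion would declare and cast numeric types.
    connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
    rows = [tuple(row[name] for name in columns) for row in reader]
    connection.executemany(
        f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})', rows
    )
    connection.commit()
    return len(rows)


# Toy input with made-up column names: three pulses belonging to two events.
example_csv = io.StringIO(
    "event_no,dom_x,dom_y,dom_z,time,charge\n"
    "0,10.0,-5.0,100.0,12.5,0.8\n"
    "0,12.0,-4.0,95.0,14.1,1.2\n"
    "1,-30.0,8.0,-50.0,3.3,0.5\n"
)
connection = sqlite3.connect(":memory:")
n_rows = csv_to_sqlite(example_csv, connection)
n_events = connection.execute(
    "SELECT COUNT(DISTINCT event_no) FROM pulses"
).fetchone()[0]
```

Replacing `":memory:"` with a file path would write the database to disk. Grouping rows by an event identifier such as the assumed `event_no` column is one way to check that the converted file preserves the event structure of the source.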
NPML seems to be creating a dataset challenge in which each dataset runs for six months before being replaced with a new one. These datasets might be interesting for benchmarking and could also provide an opportunity to show off the capabilities of GraphNeT. https://indico.ipmu.jp/event/462/page/1500-data-challenges-and-olympics