medperf
FL support
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅
working checkpoint: 20ec858ea57afcc1e183c69e3d01b65ec9b4e9bc
TODO
- [ ] stream logs
- [ ] check benchmark execution mlcube training exp ID (what is this? I forgot)
- [ ] double check if we need atomic transactions on the server
- [x] ~~We now have demo data url and hash in training exp (dummy) that we don't use.~~
- [ ] test a common scenario: model owner and agg owner being the same
- [x] ~~collaborators don't use tensorboard logs.~~ (related to logging, first point)
- [x] ~~finalize endpoints for new entities? (delete?, put?, ...)~~
- [ ] have timeouts in integration tests (3 minutes should be enough)
- [x] docker publish port only to a specified network interface
- [x] ~~limit mlcube network access? (for now we can rely on the review of the experiment owner)~~
- [ ] server permission: they are broken!!
- [x] plan being confidential (related to permissions)
- [x] running datasets before aggs??
- [x] mounting files: be careful!!! check parent folders (issue will be created)
- [ ] 404 for exp.aggregator, ca, event: not informative (check common practice for such a thing)
- [ ] writing unit tests
- [ ] finalize CA issues/deployment