Add Learning What to Defer algorithm for MIS problem example
Description
Contents include:
- A simple MIS (maximum independent set) simulator that treats a complete MIS-solving process as one transition;
- A graph-based PPO implementation, including the Policy, TrainOps, Trainer, basic networks, and replay memory;
- An EnvSampler and the necessary configuration files;
- A README that briefly introduces the algorithm.
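
For reviewers unfamiliar with the deferral idea, here is a minimal sketch of one decision step as I understand it from the Learning What to Defer formulation. It is not the simulator added in this PR. The names `step`, `is_independent`, and the three state constants are hypothetical, and the cleanup rules (conflicting inclusions fall back to deferred, then neighbors of included vertices are excluded) are an assumption about the general technique rather than a description of this code.

```python
# Illustrative sketch only; names and rules are assumptions, not this PR's API.
EXCLUDED, INCLUDED, DEFERRED = 0, 1, 2


def step(adj, state, action):
    """Apply one deferral step.

    adj:    dict mapping each vertex to the set of its neighbors
    state:  dict mapping each vertex to EXCLUDED / INCLUDED / DEFERRED
    action: dict mapping vertices to the proposed new value
    """
    new_state = dict(state)

    # Only vertices that are still deferred may receive a new decision.
    for v, a in action.items():
        if state[v] == DEFERRED:
            new_state[v] = a

    # Conflict resolution: if two adjacent vertices are both included,
    # both fall back to deferred.
    conflicted = {
        v for v in adj
        if new_state[v] == INCLUDED
        and any(new_state[u] == INCLUDED for u in adj[v])
    }
    for v in conflicted:
        new_state[v] = DEFERRED

    # Cleanup: a deferred vertex next to an included one can never join
    # the set, so it is excluded right away.
    for v in adj:
        if new_state[v] == DEFERRED and any(
            new_state[u] == INCLUDED for u in adj[v]
        ):
            new_state[v] = EXCLUDED

    return new_state


def is_independent(adj, state):
    """Return True if no two included vertices share an edge."""
    return all(
        not (state[v] == INCLUDED and state[u] == INCLUDED)
        for v in adj for u in adj[v]
    )


# Path graph 0 - 1 - 2, every vertex initially deferred.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
state = {v: DEFERRED for v in adj}

# Including both 0 and 1 conflicts, so both return to deferred.
state = step(adj, state, {0: INCLUDED, 1: INCLUDED, 2: DEFERRED})

# Including 0 and 2 is valid; vertex 1 is then excluded by cleanup.
state = step(adj, state, {0: INCLUDED, 1: DEFERRED, 2: INCLUDED})
```

An episode in this sketch ends once no vertex is deferred. The simulator in this PR wraps the whole solving process into a single transition, so its interface differs from the per-step function above.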
Linked issue(s)/Pull request(s)
- issue_number
Type of Change
- [ ] Non-breaking bug fix
- [ ] Breaking bug fix
- [ ] New feature
- [ ] Test
- [ ] Doc update
- [ ] Docker update
Related Component
- [ ] Simulation toolkit
- [ ] RL toolkit
- [ ] Distributed toolkit
Has Been Tested
- OS:
- [ ] Windows
- [ ] Mac OS
- [ ] Linux
- Python version:
- [ ] 3.7
- [ ] 3.8
- [ ] 3.9
- Key information snapshot(s):
Needs Follow Up Actions
- [ ] New release package
- [ ] New docker image
Checklist
- [ ] Add/update the related comments
- [ ] Add/update the related tests
- [ ] Add/update the related documentation
- [ ] Update the usage in dependent downstream modules