policy-gradient-methods
Modular PyTorch implementation of policy gradient methods
Results
4
policy-gradient-methods issues
Using sampled entropy rather than analytic entropy.
According to Fig. 3 of https://arxiv.org/abs/1707.06347, VPG should be able to do reasonably well after about 0.5M steps.
Discrete spaces with discrete actions do not seem to be doing well at all. This might be a wrapper problem.
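
For the first issue above, the difference between sampled and analytic entropy can be illustrated with a minimal sketch. It assumes a categorical (discrete-action) policy and uses plain Python rather than the repository's own code; the function names `analytic_entropy` and `sampled_entropy` and the probabilities are hypothetical and chosen only for illustration.

```python
import math
import random


def analytic_entropy(probs):
    """Exact entropy H(pi) = -sum_a pi(a) * log pi(a) of a categorical policy."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sampled_entropy(probs, num_samples, rng):
    """Monte Carlo estimate: the average of -log pi(a) over actions a ~ pi."""
    actions = rng.choices(range(len(probs)), weights=probs, k=num_samples)
    return -sum(math.log(probs[a]) for a in actions) / num_samples


# Hypothetical action probabilities for a single state (illustrative values).
probs = [0.7, 0.2, 0.1]
rng = random.Random(0)

exact = analytic_entropy(probs)
single = sampled_entropy(probs, 1, rng)         # one sampled action: noisy
averaged = sampled_entropy(probs, 50_000, rng)  # many samples: close to exact

print(f"analytic: {exact:.4f}")
print(f"sampled (1 action): {single:.4f}")
print(f"sampled (50k actions): {averaged:.4f}")
```

Both quantities have the same expectation, but the sampled version adds variance because it depends on which action happened to be drawn. In PyTorch, the analogous calls would typically be `dist.entropy()` and `-dist.log_prob(action)` on a `torch.distributions` object; whether the repository computes them this way is not stated in the issue.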