Frederic Go
I cannot reproduce the results from the paper. When training the jump policy, I always get a reward of 0. The default values for learning the runup policy are: algorithm.max_iterations:...