Cannot learn to jump.
I cannot reproduce the results from the paper. When training the jump policy, I always get a reward of 0.
The default values for learning the run-up policy are:
algorithm.max_iterations: 2000
experiment.env: jumper_run2
env.jumper_run2.angular_v: [-3.0, -3.0, 1.0]
env.jumper_run2.linear_v_z: -2.4
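For clarity, this is how I read the dotted override keys as a nested config (a minimal sketch assuming the usual dotted-key-to-nested-dict convention; this is not the repo's actual config loader):

```python
# Sketch only: expand dotted override keys (as listed above) into a nested
# dict to make the config structure explicit. Not part of the repo's code.
def expand_overrides(flat):
    nested = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested

runup_overrides = {
    "algorithm.max_iterations": 2000,
    "experiment.env": "jumper_run2",
    "env.jumper_run2.angular_v": [-3.0, -3.0, 1.0],
    "env.jumper_run2.linear_v_z": -2.4,
}
print(expand_overrides(runup_overrides))
# {'algorithm': {'max_iterations': 2000}, 'experiment': {'env': 'jumper_run2'},
#  'env': {'jumper_run2': {'angular_v': [-3.0, -3.0, 1.0], 'linear_v_z': -2.4}}}
```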
I then train the jump policy with the following parameters, which are the recommended ones for learning the Fosbury Flop:
algorithm.max_iterations: 12000
experiment.env: highjump
# initial state file generated by the run-up training
env.highjump.initial_state: results/runup-2022-Feb-10-175005/checkpoint_2000.tar.npy
# wall orientation in degrees
env.highjump.wall_rotation: -0.05
# must correspond to the training height of the checkpoint
env.highjump.initial_wall_height: 0.5
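One way to sanity-check that env.highjump.initial_state really points at the run-up checkpoint (a minimal sketch of my own, assuming the .tar.npy file is a standard numpy archive):

```python
# Sanity check, not part of the repo: verify the run-up checkpoint referenced
# by env.highjump.initial_state exists and can be loaded.
import os
import numpy as np

initial_state = "results/runup-2022-Feb-10-175005/checkpoint_2000.tar.npy"
assert os.path.isfile(initial_state), f"missing initial state file: {initial_state}"

# Assuming a plain numpy save; allow_pickle may be needed if it stores objects.
data = np.load(initial_state, allow_pickle=True)
print(type(data), getattr(data, "shape", None))
```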