Reacher-v1 not training

Open amolchanov86 opened this issue 9 years ago • 4 comments

Hi, I just tried running Reacher-v1 for 1,000,000 timesteps with the default settings and it didn't learn anything (it just gets stuck at a test reward of -12). It looks like you got it running with some settings, though. What were those settings?

amolchanov86, Dec 15 '16 18:12

Hey,

sorry for the late reply! The most important setting, reward normalization, is actually hardcoded into filter_env.py for Reacher-v1. The other hyperparameters should be fine. Have you tried multiple times? Are at least the two pendulum tasks working?

Cheers, Simon

rmst, Dec 22 '16 03:12
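For readers unfamiliar with the idea, reward normalization of the kind mentioned above can be sketched as a thin wrapper that rescales each step reward before the agent sees it. This is only an illustration, not the code in filter_env.py: the class names `ScaledRewardEnv` and `DummyEnv`, the parameters `reward_scale` and `reward_shift`, and the numeric values are all assumptions, and the thread does not state which scale the repo actually uses for Reacher-v1.

```python
class ScaledRewardEnv:
    """Hypothetical wrapper: rescales rewards of any env exposing
    reset() and step(action) -> (obs, reward, done, info)."""

    def __init__(self, env, reward_scale=1.0, reward_shift=0.0):
        self.env = env
        self.reward_scale = reward_scale  # placeholder value, not from the repo
        self.reward_shift = reward_shift  # placeholder value, not from the repo

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Affine rescaling keeps the ordering of rewards but changes
        # their magnitude, which affects the scale of the critic targets.
        scaled = self.reward_scale * reward + self.reward_shift
        return obs, scaled, done, info


class DummyEnv:
    """Stand-in environment (assumption) that always returns reward -2.0."""

    def reset(self):
        return 0.0

    def step(self, action):
        return 0.0, -2.0, False, {}


env = ScaledRewardEnv(DummyEnv(), reward_scale=0.5)
env.reset()
_, reward, _, _ = env.step(0)
```

With a real environment, one would wrap the environment object in place of `DummyEnv()`; a suitable scale would have to be found empirically or read from the repo itself.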

Hi, thanks for the reply!

  • I only tried once. OK, I will rerun it. The thing is, I am seeing the same problem with my own implementation, although all the balancing envs and the hopper worked fine.
  • Another question: did you try to learn any high-dimensional tasks using DDPG?
  • Last but not least: correct me if I am wrong, but you haven't tried prioritized experience replay (PER) yet? It is a bit confusing that PER is mentioned under "Improvements beyond the original paper", while from "replay_memory.py" it seems that the replay buffer is just sampled randomly. Thanks a lot!

amolchanov86, Dec 23 '16 01:12
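To make the distinction raised in the last question concrete, here is a rough sketch of the two sampling strategies. It is not taken from replay_memory.py: `UniformReplay` and `prioritized_sample` are hypothetical names, `alpha=0.6` is just an illustrative exponent, and the prioritized variant is a simplified proportional scheme that samples with replacement and omits the importance-sampling correction a full PER implementation would need.

```python
import random


class UniformReplay:
    """Fixed-size ring buffer; every stored transition is equally likely."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.pos = 0  # index of the next slot to write

    def add(self, transition):
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            # Buffer is full: overwrite the oldest entry.
            self.buffer[self.pos] = transition
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement.
        return rng.sample(self.buffer, batch_size)


def prioritized_sample(transitions, priorities, batch_size, alpha=0.6, rng=random):
    """Simplified proportional prioritization (assumption): transition i is
    drawn with probability proportional to priorities[i] ** alpha.
    In practice the priority is usually |TD error| plus a small constant."""
    weights = [p ** alpha for p in priorities]
    return rng.choices(transitions, weights=weights, k=batch_size)


buf = UniformReplay(capacity=3)
for t in range(5):
    buf.add(t)
batch = buf.sample(2)
```

Setting `alpha=0` in `prioritized_sample` makes all weights equal, which recovers uniform sampling (with replacement).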

Hey, sorry for the late reply.

I never got Reacher-v1 to "solve", but it came close (as you can see in the gif in the readme). For my evaluations I used the commit before "fixes in replay memory", but I don't believe performance got worse after that commit. I don't use prioritized experience replay. The list of improvements is only a roadmap. I haven't had time to work on it so far, and now it doesn't seem like such a big improvement compared to other things like auxiliary tasks in A3C. Maybe I will release a new TensorFlow deep RL repo, though, where we can include it.

Ah, and no, I haven't used it with convolutional nets on pixels yet. But that should also come soon (in the new repo, though).

Cheers

rmst, Jan 06 '17 19:01

Hi, thanks for the help!

amolchanov86, Jan 08 '17 00:01