Flat A3C agent

Open dai-dao opened this issue 8 years ago • 2 comments

Hi,

I really like your work and would like some clarification on your new observation about training a flat A3C agent without the meta-controller. In that case, are the sub-goals randomly generated every 'c' time-steps, instead of being output by the meta-controller?

Thanks, Dai

Dec 24 '17 23:12 dai-dao

Hi Dai,

Yes, the sub-goals were randomly generated every c=100 time-steps. I also found that fixing the sub-goal to just the first one works for some seeds. This only works with the feature-control pseudo-reward, though.

Best, Nat

Dec 25 '17 03:12 Nat-D
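
For readers following along, here is a minimal sketch of the schedule described in the reply above: a sub-goal index is drawn uniformly at random and held for c=100 time-steps before being redrawn. This is not the repository's code. The class name `RandomSubgoalSchedule`, the `num_subgoals=32` value, and the `resample` flag are hypothetical, chosen only for illustration; the uniform distribution is also an assumption, since the thread only says "randomly generated".

```python
import random


class RandomSubgoalSchedule:
    """Holds a sub-goal index and resamples it uniformly every `c` steps.

    Hypothetical helper for illustration; not taken from the repository.
    """

    def __init__(self, num_subgoals, c=100, seed=None, resample=True):
        self.num_subgoals = num_subgoals  # size of the sub-goal space (assumed discrete)
        self.c = c                        # number of time-steps a sub-goal is held
        self.resample = resample          # False keeps the first sub-goal forever
        self.rng = random.Random(seed)
        self.current = None

    def step(self, t):
        """Return the sub-goal index to condition the agent on at time-step t."""
        first_call = self.current is None
        boundary = self.resample and t % self.c == 0
        if first_call or boundary:
            # Uniform draw replaces the meta-controller's output.
            self.current = self.rng.randrange(self.num_subgoals)
        return self.current


schedule = RandomSubgoalSchedule(num_subgoals=32, c=100, seed=0)
goals = [schedule.step(t) for t in range(300)]
```

Passing `resample=False` gives the other variant mentioned in the reply, where the sub-goal stays fixed at the first one drawn.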

Does "randomly generated meta-action" work with pixel-control pseudo reward?

Dec 30 '17 11:12 nina124