N-step returns

Open slerman12 opened this issue 6 years ago • 0 comments

Do these algorithms compute n-step returns when propagating rewards? The Sonic A2C code looks like it only uses 1-step returns, V(S) = R(S) + V(S_next), but it's hard for me to tell because I'm not too familiar with GAE.
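For reference, here is a minimal sketch of how GAE relates to n-step returns in general. It is not taken from the Sonic A2C code; the function names `n_step_return` and `gae_advantages`, the default `gamma` and `lam` values, and the assumption that the rollout contains no episode terminations are all mine. GAE accumulates 1-step TD errors with weight `gamma * lam`, so `lam = 0` gives the 1-step return and `lam = 1` gives the full n-step return to the end of the rollout.

```python
def n_step_return(rewards, values, t, n, gamma=0.99):
    """n-step return from step t, bootstrapped from values[t + n].

    rewards has length T; values has length T + 1, where values[T] is the
    value estimate of the state reached after the last step.
    Assumes no episode ends inside the rollout (no done flags).
    """
    T = len(rewards)
    n = min(n, T - t)  # cannot look past the end of the rollout
    ret = sum(gamma ** k * rewards[t + k] for k in range(n))
    return ret + gamma ** n * values[t + n]


def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over a single rollout.

    Same input layout and no-termination assumption as n_step_return.
    """
    T = len(rewards)
    advantages = [0.0] * T
    running = 0.0
    for t in reversed(range(T)):
        # 1-step TD error at step t
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # discounted, lambda-weighted sum of this and all later TD errors
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Under these assumptions, `gae_advantages(...)[t] + values[t]` equals `n_step_return(..., t, 1)` when `lam=0` and equals `n_step_return(..., t, T - t)` when `lam=1`; values of `lam` in between blend all the n-step returns. So code that calls a GAE routine may look like a 1-step update while effectively using longer returns.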

Mar 09 '19 15:03 slerman12