Deep_reinforcement_learning_Course

N-step returns

Open slerman12 opened this issue 6 years ago • 0 comments

Do these algorithms compute n-step returns for reward propagation? The Sonic A2C code looks like it only uses 1-step returns, V(S) = R(S) + γ V(S_next) (with γ the discount factor), but it's hard to tell because I'm not very familiar with GAE.
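To make the question concrete, here is a minimal sketch of what I mean by 1-step returns, n-step returns, and GAE. It is not the repo's code: the function names, the `bootstrap_value` argument (the value estimate for the state after the last step of the rollout), and the assumption that no episode ends inside the rollout are all mine, for illustration only.

```python
def n_step_returns(rewards, values, bootstrap_value, gamma=0.99, n=1):
    """n-step bootstrapped return for every step of one rollout.

    Assumes no episode terminates inside the rollout (hypothetical setup).
    values[t] is V(s_t); bootstrap_value is V of the state after the last step.
    """
    T = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]  # V(s_{t+1}) for each t
    returns = []
    for t in range(T):
        end = min(t + n, T)  # truncate the lookahead at the end of the rollout
        g = 0.0
        for k in range(t, end):
            g += gamma ** (k - t) * rewards[k]
        g += gamma ** (end - t) * next_values[end - 1]  # bootstrap from V(s_end)
        returns.append(g)
    return returns


def gae_advantages(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout (same assumptions)."""
    T = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]
    advantages = [0.0] * T
    running = 0.0
    for t in reversed(range(T)):
        # 1-step TD error at step t
        delta = rewards[t] + gamma * next_values[t] - values[t]
        # exponentially weighted sum of the TD errors from t onward
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

If I understand GAE correctly, adding `values[t]` back to the advantage gives a return target that is an exponentially weighted mix of all n-step returns: `lam=0` reduces to the 1-step return above, and `lam=1` gives the full rollout return bootstrapped from the last state. So code that looks like a 1-step update inside the loop could still be propagating reward over many steps.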

slerman12, Mar 09 '19 15:03