async-rl

Why is the value loss divided by 2 on line 108 of a3c.py?

Open onlytailei opened this issue 8 years ago • 1 comment

v_loss += (v - R) ** 2 / 2

But the original paper just takes the derivative of (V - R)^2, right?

onlytailei, Jul 25 '17 15:07

Also, you mention in https://github.com/muupan/async-rl/wiki that they multiply the gradients of V by 0.5. That is why a3c.py has the parameters (pi_loss_coef=1.0, v_loss_coef=0.5). But why is there another factor of 0.5 in v_loss?
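To make the question concrete, here is a minimal sketch in plain Python. It is not the actual a3c.py code: `value_loss`, `numeric_grad`, and the values of `v` and `R` are made up for illustration, and it assumes that v_loss_coef simply scales the value loss. It compares the effective gradient with respect to v with and without the extra division by 2.

```python
def value_loss(v, R, halve):
    """Squared error between value estimate v and return R, optionally halved."""
    sq = (v - R) ** 2
    return sq / 2 if halve else sq


def numeric_grad(f, v, eps=1e-6):
    """Central finite-difference estimate of df/dv."""
    return (f(v + eps) - f(v - eps)) / (2 * eps)


V_LOSS_COEF = 0.5  # same value as v_loss_coef in the issue
v, R = 1.5, 1.0    # arbitrary illustrative numbers, so v - R = 0.5

# Loss as written in the issue: (v - R) ** 2 / 2, then scaled by the coefficient.
grad_halved = V_LOSS_COEF * numeric_grad(lambda x: value_loss(x, R, True), v)

# Loss without the extra division: (v - R) ** 2, then scaled by the coefficient.
grad_plain = V_LOSS_COEF * numeric_grad(lambda x: value_loss(x, R, False), v)

print(grad_halved)  # approximately 0.5 * (v - R) = 0.25
print(grad_plain)   # approximately 1.0 * (v - R) = 0.5
```

Under these assumptions, the halved loss combined with v_loss_coef=0.5 yields a gradient of 0.5 * (v - R), while the unhalved loss with the same coefficient yields (v - R), so the two factors of 0.5 are not redundant in terms of the resulting gradient scale.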

onlytailei, Aug 19 '17 20:08