Aashish Adhikari

Results: 3 comments by Aashish Adhikari

I think the loss of each sample in a minibatch should be multiplied by its corresponding importance-sampling (IS) weight. However, torch by default returns the MSE loss averaged over the minibatch. Hence,...
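A minimal sketch of the per-sample weighting described above, assuming PyTorch. The names `weighted_mse`, `q_pred`, `q_target`, and `is_weights` are illustrative and do not come from the code under discussion. Passing `reduction="none"` to `F.mse_loss` keeps one loss value per sample, so each can be scaled by its IS weight before the batch is averaged.

```python
import torch
import torch.nn.functional as F


def weighted_mse(q_pred, q_target, is_weights):
    """Mean of per-sample squared errors, each scaled by its IS weight."""
    # reduction="none" returns one loss per element instead of the batch mean
    per_sample = F.mse_loss(q_pred, q_target, reduction="none")
    return (is_weights * per_sample).mean()


# Illustrative values only (not from any real training run)
q_pred = torch.tensor([1.0, 2.0, 3.0])
q_target = torch.tensor([0.0, 2.0, 5.0])
is_weights = torch.tensor([1.0, 0.5, 0.25])

loss = weighted_mse(q_pred, q_target, is_weights)
unweighted = F.mse_loss(q_pred, q_target)  # default: mean over the batch
```

The weights should have the same shape as the per-sample loss. Multiplying a `(B, 1)` tensor by a `(B,)` tensor broadcasts to `(B, B)` and silently changes the result.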

You are correct that the priority value = absolute TD error. However, I think the error passed in return (error + self.e) ** self.a is already an...
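The expression quoted above can be sketched as a standalone function. This is an illustration only: `eps` and `alpha` stand in for `self.e` and `self.a`, and their default values here are assumptions, not values taken from the original code. Because `abs()` is idempotent, taking it inside the function is harmless whether or not the caller has already done so.

```python
def priority(td_error, eps=0.01, alpha=0.6):
    """Map a TD error to a replay priority: (|error| + eps) ** alpha."""
    # abs() applied to an already non-negative value leaves it unchanged,
    # so this works for both signed and pre-absolute errors.
    return (abs(td_error) + eps) ** alpha


p_neg = priority(-2.0)   # signed TD error
p_pos = priority(2.0)    # same magnitude, already absolute
p_zero = priority(0.0)   # eps keeps the priority strictly positive
```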

If anyone is still wondering why it pulls 0 from the replay memory, it is because the sampled location in the replay memory had not been filled yet...
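One way to picture this is the sketch below. It uses a hypothetical `ReplayBuffer` class with uniform sampling for simplicity, not the prioritized memory under discussion. A preallocated memory starts out holding a placeholder (0 here), so any index drawn from a slot that has not been written returns that placeholder. Restricting sampling to the filled region avoids it.

```python
import random


class ReplayBuffer:
    """Hypothetical fixed-size buffer preallocated with 0 placeholders."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [0] * capacity  # unfilled slots hold 0
        self.write = 0              # next slot to overwrite
        self.size = 0               # number of slots actually filled

    def add(self, transition):
        self.data[self.write] = transition
        self.write = (self.write + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample_naive(self, rng):
        # Draws over the whole capacity, so it can return the 0 placeholder
        return self.data[rng.randrange(self.capacity)]

    def sample_filled(self, rng):
        # Draws only over slots that have been written
        if self.size == 0:
            raise ValueError("buffer is empty")
        return self.data[rng.randrange(self.size)]


buf = ReplayBuffer(capacity=8)
buf.add(("s0", "a0", 1.0, "s1"))
buf.add(("s1", "a1", 0.0, "s2"))

rng = random.Random(0)
naive = [buf.sample_naive(rng) for _ in range(200)]
safe = [buf.sample_filled(rng) for _ in range(200)]
```

With only two of eight slots written, `naive` contains placeholder zeros while `safe` contains only stored transitions.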