8 comments of garyongguanjie

Yes, you have to seed them differently, at least within each iteration.
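
For example, something like this minimal sketch (using numpy's `SeedSequence`, nothing repo-specific) would give every worker a fresh, independent stream each iteration:

```python
import numpy as np

# One base sequence for the whole run; spawn() advances an internal
# counter, so each iteration's children are distinct from the last.
base = np.random.SeedSequence(12345)

num_workers = 4
for iteration in range(3):
    child_seqs = base.spawn(num_workers)   # one child sequence per worker
    rngs = [np.random.default_rng(s) for s in child_seqs]
    # Pass one rng (or its SeedSequence) into each worker process;
    # no two episodes ever share a seed, even across iterations.
    print(iteration, [int(rng.integers(0, 10)) for rng in rngs])
```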

Perhaps I should leave it up to the author to decide, as he stated that he wanted `Asynchronous MCTS as described in the paper`. That said, parallelizing episodes is much easier.

Any preference for a parallelization framework? I prefer joblib, but we can probably use Python multiprocessing as well.
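
If we go with joblib, a sketch could look like this; `execute_episode` here is a hypothetical stand-in for the actual self-play function, not the repo's API:

```python
import numpy as np
from joblib import Parallel, delayed

def execute_episode(seed):
    # Hypothetical stand-in for one self-play episode; returns the
    # training examples that episode produced.
    rng = np.random.default_rng(seed)
    return [("board", "pi", rng.random())]  # (state, policy, value) tuples

# Episodes are independent of each other, so they parallelize cleanly
# across processes.
per_episode = Parallel(n_jobs=4)(
    delayed(execute_episode)(seed) for seed in range(8)
)
train_examples = [ex for episode in per_episode for ex in episode]
```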

Interesting find, I did not think of that! It looks like we'd need to hold them in a buffer and, once it reaches some batch size, do the forward computation. Probably...
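
Something like this sketch is what I have in mind; `BatchedPredictor`, the `(pi, v)` return shape, and the synchronous flush are all assumptions, not the repo's actual API:

```python
import torch

class BatchedPredictor:
    """Buffer per-state predict() requests and run one batched forward
    pass once the buffer reaches `batch_size`. A sketch only: a real
    version would need to route results back to the waiting searches."""

    def __init__(self, net, batch_size=64):
        self.net = net
        self.batch_size = batch_size
        self.buffer = []

    def flush(self):
        # Run a single GPU call over everything buffered so far.
        if not self.buffer:
            return []
        batch = torch.stack(self.buffer)
        self.buffer.clear()
        with torch.no_grad():
            pi, v = self.net(batch)      # assumed (policy, value) output
        return list(zip(pi, v))

    def predict(self, state):
        self.buffer.append(state)
        if len(self.buffer) >= self.batch_size:
            return self.flush()          # results for the whole batch
        return None                      # caller waits until buffer fills
```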

> I think the most important part is the batching of neural network input when predicting p and v. When I ran the othello training, i measured that ~85-90% of...

> I wouldn't mind taking a stab at providing a general implementation. I'm not sure though that I understand the higher level bit. From reading the code when MCTS performs...

I think parallelizing MCTS itself is much harder to implement, and depending on the number of simulations it may not even be faster. But as said above, batching inputs into the GPU is the...

Use `(PackedSequence).data` to get the tensor to pass into the loss function.
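
A small self-contained example of what I mean:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Two variable-length sequences, zero-padded to the same length.
padded = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 0.0]])
lengths = torch.tensor([3, 2])  # real lengths, sorted descending

packed = pack_padded_sequence(padded, lengths, batch_first=True)

# packed.data is a flat tensor holding only the real (non-pad) steps,
# so it can go straight into the loss without the padding positions.
print(packed.data)  # tensor([1., 4., 2., 5., 3.])
loss = torch.nn.functional.mse_loss(packed.data,
                                    torch.zeros_like(packed.data))
```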