Apollo

Results: 7 issues by Apollo

https://github.com/mokemokechicken/reversi-alpha-zero/blob/5ee2f330663b34513f0c894eb658f03a1201f400/src/reversi_zero/agent/player.py#L115-L121 At first I thought this code searches with simulation_num_per_move threads at the same time, but I see the async functions are not called from multiple threads. How about...
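A minimal sketch of the pattern the question is about, with illustrative names (not the repo's actual code): many simulations run as coroutines on one event loop in a single thread, and each yields while waiting for a neural-network prediction, which is what lets the searches overlap without any multithreading.

```python
import asyncio

async def fake_predict(state):
    # stand-in for a batched NN evaluation; awaiting it suspends this
    # simulation so other simulations can proceed on the same thread
    await asyncio.sleep(0)
    return 0.0  # dummy value

async def run_simulation(root, sim_id):
    # one MCTS simulation: descend, evaluate the leaf, back up (elided)
    return await fake_predict((root, sim_id))

async def search(root, simulation_num_per_move=8):
    # launch all simulations concurrently on the event loop
    tasks = [run_simulation(root, i) for i in range(simulation_num_per_move)]
    return await asyncio.gather(*tasks)

values = asyncio.run(search("root"))
print(len(values))  # 8
```

So the concurrency here is cooperative (one thread, many coroutines), not preemptive threading.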

I see my model isn't improving anymore. Moreover, as [ThomasWAnthony's](https://www.reddit.com/r/MachineLearning/comments/76xjb5/ama_we_are_david_silver_and_julian_schrittwieser/dolnq31/) comment says, I found that "It may forget pertinent information about positions that it no longer visits" when the opponent selects actions unusually....

https://github.com/mokemokechicken/reversi-alpha-zero/blob/5ee2f330663b34513f0c894eb658f03a1201f400/src/reversi_zero/agent/model.py#L48 I see the policy softmax is calculated over all moves, including illegal ones. How can I calculate the softmax over only the legal moves, e.g. by adding a placeholder for a legal-move mask?
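One common way to do this (a sketch in NumPy, not the repo's code) is to mask the logits of illegal moves to negative infinity before the softmax, so they receive exactly zero probability:

```python
import numpy as np

def masked_softmax(logits, legal_mask):
    """logits: (n,) float array; legal_mask: (n,) bool array of legal moves."""
    masked = np.where(legal_mask, logits, -np.inf)
    masked = masked - masked[legal_mask].max()  # subtract max for stability
    exp = np.exp(masked)                        # exp(-inf) == 0
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5, 3.0])
legal = np.array([True, False, True, False])
p = masked_softmax(logits, legal)
print(p[1], p[3])  # 0.0 0.0: illegal moves get zero probability
```

In TensorFlow the same idea applies: feed the legal-move mask as an extra input and add a large negative number to the illegal logits before the softmax layer.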

https://github.com/mokemokechicken/reversi-alpha-zero/blob/f1cfa6c7177ec5f76a89e20fd97eb4c5d678611d/src/reversi_zero/agent/player.py#L165-L168 I see N and W are updated with a virtual loss when a node is selected, in order to discourage other threads from simultaneously exploring the identical variation (as described in the paper). 1. Why...

Thanks for sharing your code. I'll train the model myself with TensorFlow. How can I get the training image dataset?

I can't find single_play.py. How can I start training it in self-play mode? And have you trained with the AlphaGo Zero method, and what were the results? Thanks.

https://github.com/Akababa/Chess-Zero/blob/90a5aad05656131506239388557b9f60d16235a3/src/chess_zero/worker/self_play.py#L33-L41 I see you create a list of max_processes pipe groups, each holding search_threads pipes, and launch max_processes self-play processes with that list. Each self-player takes its own pipe group by popping...
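A hedged sketch of that setup (names are illustrative, not the repo's actual API): build `max_processes` groups of `search_threads` pipes up front, then each self-play worker pops one whole group so its search threads each have a dedicated channel to the prediction server.

```python
import multiprocessing as mp

def make_pipe_groups(max_processes, search_threads):
    # one group of search_threads pipes per self-play process
    return [[mp.Pipe() for _ in range(search_threads)]
            for _ in range(max_processes)]

def self_play_worker(pipes, results):
    # each worker owns search_threads pipe endpoints for NN predictions;
    # here we just report how many we received
    results.put(len(pipes))

if __name__ == "__main__":
    max_processes, search_threads = 2, 4
    groups = make_pipe_groups(max_processes, search_threads)
    results = mp.Queue()
    workers = []
    while groups:
        # pop() hands each process its own exclusive pipe group
        p = mp.Process(target=self_play_worker, args=(groups.pop(), results))
        p.start()
        workers.append(p)
    for p in workers:
        p.join()
    print(results.get() + results.get())  # 8: each worker got 4 pipes
```

Popping from the shared list in the parent (before the processes start) is what guarantees no two workers share a pipe.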