Documentation for repository structure
Hi
Firstly, thanks for all the hard work on AlphaZero reproduction!
Is it possible to run the C++ implementation locally (without a Google Cloud cluster)?
I'm trying to run Minigo's C++ implementation locally, mostly for learning purposes. After exploring the repo, it's not completely clear to me what the purpose of the different folders is, what the interplay between the Python and C++ implementations is, or what the main entry point for training locally would be.
So far I've figured out (perhaps incorrectly):
- `cc` - is this a fully standalone implementation? Can it be used to train a model using `concurrent_selfplay` as an entry point? Does it talk to Python in any way?
- `cluster` - looks like Kubernetes stuff, can be ignored when running locally (?)
- `ml_perf` - the script `start_selfplay.sh` calls the C++ `concurrent_selfplay`, but `train.py` calls Python? I guess this is an MLPerf wrapper and is not required when running locally?
- `rl_loop` - looks like some kind of wrapper?
- `.py` files in the `minigo` folder - looks like the Python implementation? Is it fully independent, or does it talk to C++ in any way?
My current overall theory is that selfplay (and test games?) can be run with either C++ or Python, but training the neural network can only be executed in Python, and wrappers take care of switching between the two at the right times?
It seems I need to bootstrap, then selfplay/train in a loop. The follow-up issue is that there are multiple bootstrap, selfplay, and train scripts across the repository, some of them wrappers around others, and it's not obvious to me which folder contains the "master" training loop for local C++ end-to-end execution (if it is possible at all).
Thanks in advance, I'll keep digging in the meantime.
It certainly is possible to run the Minigo pipeline locally, though I have personally never done so :)
Your understanding of the codebase is correct. The selfplay is all done in C++ by `concurrent_selfplay` (the Python code still runs, but is slower and has fewer features). Training is done in Python.
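To make that split concrete, here's a rough sketch of one iteration of the loop. This is purely illustrative: the flag names below are placeholders rather than the binaries' real interfaces, and in the actual benchmark selfplay and training run concurrently rather than in strict alternation.
```
# Illustrative sketch only: flag names are placeholders, and the real
# ml_perf scripts run selfplay and training concurrently.
BASE_DIR=$(pwd)/ml_perf/results/local

for generation in 1 2 3; do
  # Selfplay (C++): play games with the newest model and write
  # training examples to disk.
  ./bazel-bin/cc/concurrent_selfplay \
    --model="$BASE_DIR/models/latest" \
    --output_dir="$BASE_DIR/selfplay"

  # Training (Python): consume those examples and export the next
  # model generation for the selfplay workers to pick up.
  python3 train.py \
    --training_data="$BASE_DIR/selfplay" \
    --export_path="$BASE_DIR/models/latest"
done
```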
It sounds like the `ml_perf` directory is what you need: it's a self-contained benchmark that trains a small model that learns to play something that looks like 19x19 Go within a day or two on a VM with 8 V100 GPUs and 96 cores. If you want to try something smaller to start with, you can train a 9x9 model much more quickly (just change `--board_size=19` to `--board_size=9` in the instructions). You'll also have to bootstrap the training process using random games instead of the checkpoint that the benchmark instructions describe:
```
./ml_perf/scripts/bootstrap.sh \
  --board_size=9 \
  --base_dir=$BASE_DIR
```
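Once the bootstrap completes, you can sanity-check that it produced an initial (random) model and some selfplay data before starting the main loop; the exact directory layout may vary between versions:
```
# List the bootstrap output; you should see an initial model and
# the selfplay games it generated.
ls -R "$BASE_DIR" | head -n 40
```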
Please let us know how you get on or if you have any questions; we'll be happy to help. Good luck!
So, just to confirm: if I run the instructions from `ml_perf/README.md`, but replace this:
```
# Download & extract bootstrap checkpoint.
gsutil cp gs://minigo-pub/ml_perf/0.7/checkpoint.tar.gz .
tar xfz checkpoint.tar.gz -C ml_perf/

# Download and freeze the target model.
mkdir -p ml_perf/target/
gsutil cp gs://minigo-pub/ml_perf/0.7/target.* ml_perf/target/
python3 freeze_graph.py --flagfile=ml_perf/flags/19/architecture.flags --model_path=ml_perf/target/target

# Set the benchmark output base directory.
BASE_DIR=$(pwd)/ml_perf/results/$(date +%Y-%m-%d-%H-%M)

# Bootstrap the training loop from the checkpoint.
# This step also builds the required C++ binaries.
# Bootstrapping is not considered part of the benchmark.
./ml_perf/scripts/init_from_checkpoint.sh \
  --board_size=19 \
  --base_dir=$BASE_DIR \
  --checkpoint_dir=ml_perf/checkpoints/mlperf07
```
with this (found in `ml_perf/scripts/bootstrap.sh`):
```
# Set the benchmark output base directory.
BASE_DIR=$(pwd)/ml_perf/results/$(date +%Y-%m-%d-%H-%M)

# Plays selfplay games using a random model in order to bootstrap the
# reinforcement learning training loop.
# Example usage:
./ml_perf/scripts/bootstrap.sh \
  --board_size=19 \
  --base_dir=$BASE_DIR
```
then in theory I would at least be barking up the right tree?
Yep, that looks like the correct tree.
I do recommend trying 9x9 before 19x19 though, it's around 10x faster:
```
BASE_DIR=$(pwd)/ml_perf/results/$(date +%Y-%m-%d-%H-%M)

./ml_perf/scripts/bootstrap.sh \
  --board_size=9 \
  --base_dir=$BASE_DIR
```
You may also want to change `ml_perf/scripts/start_selfplay.sh` to have the selfplay binary write SGF files after each game completes, for debugging purposes:
```
./bazel-bin/cc/concurrent_selfplay \
  --sgf_dir="${sgf_dir}/selfplay/\$MODEL/${device}" \
  etc...
```
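Assuming `sgf_dir` points somewhere under your base directory, you can then watch finished games accumulate with something like:
```
# Count completed selfplay games; adjust the path to match wherever
# your --sgf_dir flag actually points.
find "$BASE_DIR" -name '*.sgf' | wc -l
```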