alphagozero

Huge number of files created

Open brianprichardson opened this issue 8 years ago • 3 comments

It ran for a couple of days and found several new best models. However, it also creates numerous files (502,586 items, totalling 5.6 GB). The models directory is large and the games directory has most of the files. Perhaps zipping would be worthwhile. In any case, I'm happy to restart it again after you have had a chance to make more improvements. Thanks again for sharing.

brianprichardson avatar Nov 26 '17 04:11 brianprichardson

Hmm yes it does create a bunch of files. There is a file for each move of every game of every model.

The advantage is that the training code only has to list a model's directory once (which is usually pretty fast) and then opens each file only once per training batch. During training, the samples are drawn at random from any move of any game, and random access gets slow quickly once the data is huge.
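As a rough illustration of that access pattern (not the project's actual code), here is a minimal sketch that lists a directory tree once and then draws random move files per batch. The function names `list_move_files` and `sample_batch` and the one-file-per-move layout under `model_dir` are assumptions made for the example.

```python
import os
import random


def list_move_files(model_dir):
    """Walk a model's directory tree once and return every file path found."""
    paths = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            paths.append(os.path.join(root, name))
    paths.sort()  # stable order, independent of the filesystem's listing order
    return paths


def sample_batch(paths, batch_size, rng=random):
    """Pick up to batch_size distinct paths uniformly at random."""
    return rng.sample(paths, min(batch_size, len(paths)))
```

The listing is done once and kept in memory, so each batch only costs `batch_size` file opens; the trade-off is that the path list itself grows with the number of files.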

I could zip the files of past models, as they are no longer used after some point (though DeepMind says they use the last 500k games, which would correspond to 40M files in the current architecture, i.e. about 80 move files per game). But it's not really my current focus, as I feel there are other optimizations still to be done.
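For reference, zipping a past model's directory could look roughly like the sketch below, using Python's standard `zipfile` module. The helper names `archive_model_dir` and `read_move` are hypothetical, and the sketch assumes `zip_path` lies outside `model_dir` so the archive does not try to include itself.

```python
import os
import zipfile


def archive_model_dir(model_dir, zip_path):
    """Pack every file under model_dir into one zip; return the file count."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(model_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to model_dir so the archive is relocatable.
                zf.write(full, arcname=os.path.relpath(full, model_dir))
                count += 1
    return count


def read_move(zip_path, arcname):
    """Read one archived file back as bytes without extracting the whole zip."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.read(arcname)
```

This only writes the archive; the original files are left in place, so deleting them afterwards would be a separate, deliberate step.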

Do you have any other ideas to make it better?

Narsil avatar Nov 27 '17 14:11 Narsil

Could it be put in a sqlite database?
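To make the suggestion concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table layout (`moves` with `model`, `game`, `move`, `data` columns) and the helper names are assumptions for illustration, not anything from the project.

```python
import sqlite3


def open_store(db_path):
    """Open (or create) a database holding one row per move."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS moves ("
        " model TEXT NOT NULL,"
        " game INTEGER NOT NULL,"
        " move INTEGER NOT NULL,"
        " data BLOB NOT NULL,"
        " PRIMARY KEY (model, game, move))"
    )
    return conn


def add_move(conn, model, game, move, data):
    """Insert or overwrite the serialized sample for one move."""
    conn.execute(
        "INSERT OR REPLACE INTO moves (model, game, move, data)"
        " VALUES (?, ?, ?, ?)",
        (model, game, move, data),
    )


def sample_moves(conn, batch_size):
    """Return up to batch_size randomly chosen samples as bytes."""
    rows = conn.execute(
        "SELECT data FROM moves ORDER BY RANDOM() LIMIT ?", (batch_size,)
    ).fetchall()
    return [row[0] for row in rows]
```

Note that `ORDER BY RANDOM()` has to visit every row, so on a table with tens of millions of moves a different sampling strategy (for example, picking random row ids) would probably be needed.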

tianshuo avatar Dec 26 '17 05:12 tianshuo

It could. But for now I won't do it, as I feel a filesystem is the best option because it can be split across machines quite easily (I'm pondering trying to use AWS to reach the infamous 0.4s/move claimed by AlphaGo Zero).

Narsil avatar Dec 27 '17 06:12 Narsil