Dump complete population every generation so you can continue where you left off
-
We were doing a big run of Duke on 50,000 records (randomly sampled from 600,000), using a population of 300 and asking 15 questions per generation. Each generation took approximately 3 min 45 sec. After 15 generations the program stopped because of a MySQL database connection error.
-
lars
Having Duke dump the complete population every generation so you really can continue where you left off might actually be an idea.
-
We're willing to write a patch that dumps the complete population every generation, so that we can resume long training runs that were stopped in the middle.
-
lars
That's great. I should do it myself, but time is limited.
-
swami
Could you give me a brief outline of what we need to do, and which parts of the code we need to look at, so that we can do it the right way?
-
lars
Start with the class no.priv.garshol.duke.genetic.Driver, and register a new option, something like --dumpstate=. The value must be passed to GeneticAlgorithm in the same package. Then, at the bottom of GeneticAlgorithm.evolve(), if the option is set, call some new method that dumps the state.
Look for the section starting with the comment: // if asked to, write config
Add a new section that loops over the entire population instead of just saving the best one.
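A minimal sketch of what such a dump method could look like. This is not Duke's actual code: the class name PopulationDump, the method dumpState, and the config-NNN.xml file naming are made up for illustration, and each individual is assumed to have already been serialized to a string (turning a Duke configuration into text is left out here).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper, not part of Duke.
class PopulationDump {

    /**
     * Writes every individual of the population to its own file in dir,
     * named config-000.xml, config-001.xml, and so on.
     * serializedConfigs is assumed to hold one serialized configuration
     * per individual, in population order.
     */
    static void dumpState(List<String> serializedConfigs, Path dir) throws IOException {
        Files.createDirectories(dir);
        for (int ix = 0; ix < serializedConfigs.size(); ix++) {
            Path target = dir.resolve(String.format("config-%03d.xml", ix));
            // Write to a temporary file first, then move it into place, so a
            // crash in the middle of a write is less likely to leave a
            // truncated dump from the previous generation's slot.
            Path tmp = dir.resolve(target.getFileName() + ".tmp");
            Files.write(tmp, serializedConfigs.get(ix).getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Calling this once per generation overwrites the previous dump, so the directory always holds the most recently completed generation. If the population size can shrink between generations, stale files with higher numbers would need to be removed as well; the sketch does not do that.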
Then, you'll need another new option, maybe --restorestate=, which instead of generating a new random population (the line population.create() in GeneticAlgorithm.run()) loads the existing one from that directory.
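The restore side could be sketched roughly as follows. Again, this is an illustration under assumptions and not Duke's code: PopulationRestore, restoreState, and initialPopulation are hypothetical names, the files are assumed to be named config-NNN.xml with one serialized configuration each, and parsing the text back into Duke configurations is left out. The createRandom supplier stands in for whatever currently builds the random starting population.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Duke.
class PopulationRestore {

    /** Reads all config-*.xml files in dir, sorted by file name. */
    static List<String> restoreState(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        // The glob skips leftover *.tmp files from an interrupted dump.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "config-*.xml")) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        Collections.sort(files); // zero-padded names keep population order
        List<String> configs = new ArrayList<>();
        for (Path p : files) {
            configs.add(new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
        }
        return configs;
    }

    /**
     * Chooses the starting population: a fresh random one when no restore
     * directory was given, otherwise the dumped one.
     */
    static List<String> initialPopulation(String restoreDir,
                                          Supplier<List<String>> createRandom) throws IOException {
        if (restoreDir == null) {
            return createRandom.get();
        }
        List<String> restored = restoreState(Paths.get(restoreDir));
        if (restored.isEmpty()) {
            throw new IOException("No dumped configurations found in " + restoreDir);
        }
        return restored;
    }
}
```

Failing loudly on an empty directory, instead of silently falling back to a random population, avoids restarting a long run from scratch because of a mistyped path.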