
Dump complete population every generation so you can continue where you left off

Open swamikevala opened this issue 11 years ago • 1 comments

  • swami

    We were doing a big run of Duke on 50,000 records (randomly sampled from 600,000), using a population of 300 and asking 15 questions per generation, which took approximately 3 min 45 sec per generation. After 15 generations the program stopped because of a MySQL database connection error.

  • lars

    Having Duke dump the complete population every generation so you really can continue where you left off might actually be an idea.

swamikevala, Sep 19 '14 08:09

  • swami

    We're willing to write a patch that dumps the complete population every generation, so that we can resume long training runs that were stopped partway through.

  • lars

    That's great. I should do it myself, but time is limited.

  • swami

    Could you give me a brief outline of what we need to do, and which parts of the code we need to look at, so that we can do it the right way?

  • lars

    Start with the class no.priv.garshol.duke.genetic.Driver, and register a new option, something like --dumpstate=. The value must be passed to GeneticAlgorithm in the same package. Then, at the bottom of GeneticAlgorithm.evolve(), if the option is set, call some new method that dumps the state.

    Look for the section starting with the comment: // if asked to, write config

    Add a new section that loops over the entire population instead of just saving the best one.

    Then, you'll need another new option, maybe --restorestate=, which instead of generating a new random population (the line population.create() in GeneticAlgorithm.run()) loads the existing one from that directory.
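As a rough illustration of the dump/restore idea outlined above, here is a minimal, self-contained Java sketch. It is not Duke code: `PopulationCheckpoint`, `Restored`, the `gen-NNNN` directory layout, and the `LATEST` manifest file are all hypothetical names invented for this example, and each individual is assumed to be already serialized to a configuration string. How an individual is actually serialized, and how this would hook into `GeneticAlgorithm.evolve()` and `GeneticAlgorithm.run()`, is left to the real patch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Duke): checkpoints a whole population to a directory. */
final class PopulationCheckpoint {

  /** Result of restore(): the generation number plus one serialized config per individual. */
  static final class Restored {
    final int generation;
    final List<String> configs;

    Restored(int generation, List<String> configs) {
      this.generation = generation;
      this.configs = configs;
    }
  }

  /** Writes every individual of the given generation, one file each, under dir/gen-NNNN. */
  static void dump(Path dir, int generation, List<String> configs) throws IOException {
    Path genDir = dir.resolve(String.format("gen-%04d", generation));
    Files.createDirectories(genDir);
    for (int i = 0; i < configs.size(); i++) {
      Path file = genDir.resolve(String.format("individual-%04d.xml", i));
      Files.write(file, configs.get(i).getBytes(StandardCharsets.UTF_8));
    }
    // The manifest is written last, so it only names a generation whose
    // individual files have already been written.
    String manifest = generation + " " + configs.size();
    Files.write(dir.resolve("LATEST"), manifest.getBytes(StandardCharsets.UTF_8));
  }

  /** Reads back the generation named in the LATEST manifest. */
  static Restored restore(Path dir) throws IOException {
    String manifest =
        new String(Files.readAllBytes(dir.resolve("LATEST")), StandardCharsets.UTF_8);
    String[] parts = manifest.trim().split(" ");
    int generation = Integer.parseInt(parts[0]);
    int size = Integer.parseInt(parts[1]);

    Path genDir = dir.resolve(String.format("gen-%04d", generation));
    List<String> configs = new ArrayList<>();
    for (int i = 0; i < size; i++) {
      Path file = genDir.resolve(String.format("individual-%04d.xml", i));
      configs.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
    return new Restored(generation, configs);
  }
}
```

In this sketch the dump would be called once per generation with the directory given to the dump option, and the restore would be called in place of creating a random population. Keeping one subdirectory per generation costs disk space, but it means an interrupted dump never overwrites the last complete one.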

swamikevala, Sep 25 '14 05:09