Benchmarks available?
It's been a while since I used TPOT in a project, and I would like to try out the V2 version in an upcoming project. Are there any benchmarks available to date comparing TPOT2 Alpha to TPOT1? Or is it mainly just a refactoring of the codebase?
Thank you for the interest! We are currently working on benchmarking and some optimization. Our preliminary results so far suggest that TPOT2 performs similarly to TPOT1 when run with similar parameters. However, this is still in development and more testing needs to be done. Once we have more results, I will link to them here.
Speaking of benchmarks, are there any for the Dask parallelization promised in README.md, and might there even be a guide to using it?
TPOT2 parallelizes evaluations of pipelines with Dask under the hood. There is nothing that the user needs to do to use it. You can tell TPOT2 how many processes/cores to use with the n_jobs parameter.
Or, if you want to use your own custom Dask client, you can pass in your client instead.
Tutorial 7 covers Dask and TPOT2:
https://github.com/EpistasisLab/tpot2/blob/main/Tutorial/7_dask_parallelization.ipynb
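For readers who want a quick picture before opening the tutorial, here is a minimal sketch of the two options described above. It is not taken from the tutorial. `n_jobs` comes from this thread; the `client` keyword name, the use of `tpot2.TPOTClassifier` with otherwise default arguments, and the rule that a supplied client takes the place of `n_jobs` are assumptions about the alpha API and should be checked against the installed version. The helper and function names are hypothetical.

```python
def parallel_kwargs(n_jobs=1, client=None):
    """Pick the keyword arguments that control parallelization.

    Assumption: a user-supplied Dask client is used instead of n_jobs,
    so only one of the two is passed on.
    """
    if client is not None:
        return {"client": client}  # assumed keyword name
    return {"n_jobs": n_jobs}


def fit_with_n_jobs(X, y, n_jobs=4):
    """Option 1: let TPOT2 manage Dask internally via n_jobs."""
    import tpot2  # imported lazily; requires tpot2 to be installed

    est = tpot2.TPOTClassifier(**parallel_kwargs(n_jobs=n_jobs))
    est.fit(X, y)
    return est


def fit_with_client(X, y, n_workers=4):
    """Option 2: create your own Dask client and hand it to TPOT2."""
    import tpot2
    from dask.distributed import Client, LocalCluster

    # A local cluster is used here for illustration; any Dask cluster
    # that a Client can connect to should work the same way.
    with LocalCluster(n_workers=n_workers, threads_per_worker=1) as cluster, \
            Client(cluster) as client:
        est = tpot2.TPOTClassifier(**parallel_kwargs(client=client))
        est.fit(X, y)
    return est
```

Calling `fit_with_n_jobs(X, y)` is the zero-configuration path. `fit_with_client(X, y)` is the shape to adapt if you already run a Dask cluster. In practice you would also set a time or generation budget on the estimator; the parameter names for that are left out here because they may differ between alpha releases.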
We submitted a paper to GPTP that includes some comparisons with TPOT1. I will link to it when it is published.