From pandas
Taking Matt's idea
Are there benchmarks in pandas that would be appropriate to reuse here?
Here's a bunch taken from some of the files in https://github.com/pandas-dev/pandas/tree/master/asv_bench/benchmarks
All of these at least run. I still need to go through and shorten the longer-running ones.
Whoa, that's a lot of benchmarks :) I suspect that more is generally better, especially at this stage. Any thoughts on whether, and how, we should organize them into high and low priority?
@TomAugspurger I have a work-in-progress PR for cleaning up those benchmarks: https://github.com/pandas-dev/pandas/pull/14099. I strongly recommend taking them from there or waiting until it is merged :-) (I am going to try to finish it this week). The original benchmarks were automatically generated from the vbench ones and contain a huge amount of duplicate code and repetitive names.
I hope you didn't already do some of this clean-up yourself for this PR, as that would probably be duplicate work (or if you did, I can maybe pick some things from here for my PR).
@jorisvandenbossche 👍 not sure how I missed that. Most of what I did was pep8 formatting, removing the * import, and excluding benchmarks that don't make sense for dask. I'll hold off until you merge that and then re-submit this.