dask-benchmarks

From pandas

Open TomAugspurger opened this issue 9 years ago • 3 comments

Taking Matt's idea

Are there benchmarks in Pandas that are appropriate to take?

Here's a bunch taken from some of the files in https://github.com/pandas-dev/pandas/tree/master/asv_bench/benchmarks

All of these at least run. I still need to go through and shorten the longer-running ones.

TomAugspurger, Nov 06 '16 21:11

Whoa, that's a lot of benchmarks :) I suspect that more-is-better generally, especially at this stage. Any thoughts on if/how we should organize them into high and low priority?

mrocklin, Nov 06 '16 22:11

@TomAugspurger I have a work-in-progress PR for cleaning up those benchmarks (https://github.com/pandas-dev/pandas/pull/14099). I strongly recommend taking them from there or waiting until it is merged :-) (I am going to try to finish it this week). The original benchmarks were automatically generated from the vbench ones and contain a huge amount of duplicate code and repetitive names.

I hope you didn't already do some of this clean-up yourself for this PR, as that would probably be duplicated work (or, if you did, I can maybe pick some things from here for my PR).

jorisvandenbossche, Nov 07 '16 09:11

@jorisvandenbossche 👍 not sure how I missed that. Most of the stuff I did was pep8 formatting, removing the * import, and excluding benchmarks that don't make sense for dask. I'll hold off until you merge that and then re-submit this.

TomAugspurger, Nov 08 '16 01:11