bedshift

Performance improvement

Open aaron-gu opened this issue 6 years ago • 2 comments

pandas is slow for the large number of single-row operations that bedshift performs. It may be faster to read BED files into a native Python object, such as a list or dictionary, and run the operations on that instead.
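To illustrate the idea, here is a minimal sketch of the native-object approach. It is not bedshift's actual code: `read_bed` and `shift_region` are hypothetical helpers, and it assumes a tab-separated BED file whose first three columns are chromosome, start, and end.

```python
import io


def read_bed(handle):
    """Parse BED lines into a plain list of [chrom, start, end] lists.

    Hypothetical helper: only the first three columns are kept.
    """
    regions = []
    for line in handle:
        line = line.strip()
        # Skip blank lines and common BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.append([chrom, int(start), int(end)])
    return regions


def shift_region(region, offset):
    """Return a copy of one region moved by `offset` bases.

    The start is clamped at 0 and the region length is preserved.
    """
    chrom, start, end = region
    new_start = max(0, start + offset)
    return [chrom, new_start, new_start + (end - start)]


# Usage: an in-memory stand-in for an open BED file.
bed_text = "chr1\t100\t200\nchr1\t300\t400\n"
regions = read_bed(io.StringIO(bed_text))
shifted = [shift_region(r, 50) for r in regions]
```

Each single-row operation here is ordinary list indexing and integer arithmetic, so it avoids the per-access overhead of going through a DataFrame one row at a time. Whether that translates into a real speedup for bedshift would need to be measured.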

aaron-gu, Dec 19 '19 18:12

This has been mostly addressed by commit d6674fc084b0805aedd3337ef0bd7c1adfeab669. Cut and merge are now the two slowest operations, but they still complete reasonably quickly. I'll keep this issue open to see whether we can move off pandas in the future.

aaron-gu, May 11 '20 19:05

Also see #11

aaron-gu, Mar 12 '21 04:03