bedshift
Bedfile perturbation tool
While using `bedshift` recently, I noticed that the algorithm seems to produce *invalid* regions. Specifically, it shifted my bedfile to create a region with a start that occurred after...
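A quick way to check an output for this kind of problem is sketched below. It assumes the truncated report refers to a start coordinate that ends up past the end coordinate, and it also flags negative starts. The function name `find_invalid_regions` and the `(chrom, start, end)` tuple layout are illustrative choices, not part of `bedshift`.

```python
def find_invalid_regions(regions):
    """Return (index, region) pairs whose coordinates look invalid.

    Assumption: a region is invalid if start < 0 or start > end.
    Zero-length regions (start == end) are left alone.
    """
    invalid = []
    for i, (chrom, start, end) in enumerate(regions):
        if start < 0 or start > end:
            invalid.append((i, (chrom, start, end)))
    return invalid


# Hypothetical shifted output: the second region has start > end.
regions = [
    ("chr1", 100, 200),
    ("chr1", 500, 450),
    ("chr2", 50, 80),
]
invalid = find_invalid_regions(regions)
```

Running a check like this on the perturbed file right after shifting would make it easier to attach a minimal reproducing example to the report.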
[The requirements file](https://github.com/databio/bedshift/blob/c12e549793d298f08c96d3ef71b7fce6331db0e1/requirements/requirements-all.txt#L2) doesn't pin a pandas version, so a fresh install will pull in pandas 2.x. The `DataFrame.append` method was deprecated in pandas 1.4 and removed in pandas 2.0, so any call to `df.append` now raises an `AttributeError`. This is occurring in a few...
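The usual replacement for `DataFrame.append` is `pd.concat`, which works on both pandas 1.x and 2.x. The snippet below is a minimal sketch of that migration. The column names and the `new_row` value are made up for illustration and are not taken from the `bedshift` source.

```python
import pandas as pd

# Illustrative data; bedshift's real column layout may differ.
regions = pd.DataFrame({"chrom": ["chr1"], "start": [100], "end": [200]})
new_row = {"chrom": "chr2", "start": 50, "end": 80}

# pandas < 2.0 only (removed in 2.0):
# regions = regions.append(new_row, ignore_index=True)

# Works on pandas 1.x and 2.x: wrap the row in a DataFrame and concatenate.
regions = pd.concat([regions, pd.DataFrame([new_row])], ignore_index=True)
```

Until the calls are migrated, pinning `pandas<2.0` in the requirements file would be a stopgap.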
The shift operation is really slow. I think it can be improved if regions are not modified in place, but are added as new regions and old regions are...
pandas is slow for the large number of single-row operations in bedshift. It may be faster to read bedfiles into a native object like a list or dictionary and...
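As a rough sketch of the native-object idea, the code below parses the first three BED columns into a plain list and shifts regions in a simple loop. The helper names `read_bed` and `shift_regions` and the parameters `rate`, `mean`, and `stdev` are hypothetical. This is not the `bedshift` implementation, and no timing claim is made here.

```python
import io
import random


def read_bed(handle):
    """Parse the first three BED columns into a list of [chrom, start, end]."""
    regions = []
    for line in handle:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append([chrom, int(start), int(end)])
    return regions


def shift_regions(regions, rate, mean, stdev, rng):
    """Shift a random fraction of regions in place, keeping their width."""
    for region in regions:
        if rng.random() < rate:
            delta = int(rng.gauss(mean, stdev))
            # Clamp so the start coordinate never drops below zero.
            delta = max(delta, -region[1])
            region[1] += delta
            region[2] += delta
    return regions


bed_text = "chr1\t100\t200\nchr1\t500\t650\nchr2\t50\t80\n"
regions = read_bed(io.StringIO(bed_text))
shifted = shift_regions(
    [r[:] for r in regions], rate=1.0, mean=0.0, stdev=20.0, rng=random.Random(0)
)
```

Because the same offset is applied to both coordinates, each region keeps its width and start can never pass end. Benchmarking against the pandas version on a large bedfile would show whether the change is worth making.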
`bedshift` has been fully integrated into [geniml](https://github.com/databio/geniml_dev/tree/master/geniml/bedshift), including [CLI](https://github.com/databio/geniml_dev/blob/3675f981f207d2d37eccc28d406d3ce65da5d747/geniml/cli.py#L464-L575), [pytests](https://github.com/databio/geniml_dev/blob/master/tests/test_bedshift.py), and [doc](https://github.com/databio/bedbase/blob/master/docs/geniml/tutorials/bedshift.md). This repository should be safe to deprecate. **I do not have repository options access**.