ProphagePredictionComparisons

[Suggestion] Add strand information to the gbk files

Open apcamargo opened this issue 4 years ago • 1 comment

Gene strand can be very useful for detecting prophages, but it is currently missing from the .gb files. As a result, there is no way to benchmark a tool that leverages strandedness using proteins/ORFs extracted from this dataset's .gb files (with genbank2sequences.py, for example).

apcamargo, Jul 14 '21 03:07

Sorry for the late response here, I've only just had some time to revisit this project. The strand information is already available in the GenBank files: it is encoded in each feature's location.

5' to 3' gene:

gene            9762..10592

3' to 5' gene:

gene            complement(9762..10592)
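To illustrate how a downstream script could recover the strand from those location strings, here is a minimal sketch using only the Python standard library. The function names (`location_strand`, `gene_strands`) are hypothetical and are not part of genbank2sequences.py; the sketch assumes one feature per line and only looks at an outermost `complement(...)` wrapper, so a location such as `join(...)` with mixed strands would need more care.

```python
import re

# Matches a feature-table line such as
# "     gene            complement(9762..10592)".
FEATURE_LINE = re.compile(r"^\s*(?P<key>\S+)\s+(?P<location>\S+)\s*$")


def location_strand(location: str) -> int:
    """Return +1 for a forward-strand location, -1 for a complement() one.

    Assumption: only the outermost complement(...) wrapper is checked.
    """
    return -1 if location.strip().startswith("complement(") else 1


def gene_strands(lines, feature_key="gene"):
    """Yield (location, strand) for each feature line with the given key."""
    for line in lines:
        match = FEATURE_LINE.match(line)
        if match and match.group("key") == feature_key:
            location = match.group("location")
            yield location, location_strand(location)


# The two example lines from this thread.
example = [
    "     gene            9762..10592",
    "     gene            complement(9762..10592)",
]

for location, strand in gene_strands(example):
    print(location, strand)
```

A full GenBank parser is probably the safer choice for real files; Biopython, for instance, exposes the same information as `feature.location.strand` on each parsed feature.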

The programs that run from the genome FASTA-format files generally create their own annotations and will have strand info available (and if not it's their own fault).

beardymcjohnface, Dec 08 '21 04:12