NCBI Sequin tbl format parser
Hello,
Is there any way to parse files in the "NCBI Sequin tbl" format?
Here is an example of the format:
>Feature Chr_1
1 8618836 REFERENCE
CFMR 12345
1495 550 gene
locus_tag Tatro_000001
1495 1230 mRNA
1171 550
product hypothetical protein
transcript_id gnl|ncbi|Tatro_000001-T1_mrna
protein_id gnl|ncbi|Tatro_000001-T1
1495 1230 CDS
1171 550
codon_start 1
db_xref InterPro:IPR002410
db_xref PFAM:PF08386
db_xref InterPro:IPR000073
db_xref InterPro:IPR029058
db_xref InterPro:IPR013595
db_xref InterPro:IPR050266
db_xref PFAM:PF12697
note MEROPS:MER0025512
product hypothetical protein
transcript_id gnl|ncbi|Tatro_000001-T1_mrna
protein_id gnl|ncbi|Tatro_000001-T1
5108 3585 gene
locus_tag Tatro_000002
5108 4516 mRNA
4452 3585
product hypothetical protein
transcript_id gnl|ncbi|Tatro_000002-T1_mrna
protein_id gnl|ncbi|Tatro_000002-T1
4959 4516 CDS
4452 3781
codon_start 1
db_xref PFAM:PF00172
db_xref InterPro:IPR001138
db_xref InterPro:IPR036864
product hypothetical protein
transcript_id gnl|ncbi|Tatro_000002-T1_mrna
protein_id gnl|ncbi|Tatro_000002-T1
Regards
No, I don't think we have anything for this, although I did once look at the semi-related NCBI protein tables (*.ptt files): https://github.com/biopython/biopython/issues/1725
What is your use case (and can you use the GenBank format files instead)?
I need to parse and process this specific file format.
I think it could be parsed into SeqRecord objects (with missing sequences - although we do know their lengths) and SeqFeature objects, allowing it to fit under Bio.SeqIO. A simpler parser might suffice for your needs?
(with missing sequences - although we do know their lengths)
Then you can create a Seq object with a defined length but undefined sequence contents.
In the end, I wrote my own parser from scratch, reading the file line by line.
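For anyone landing here with the same need, below is a minimal sketch of what such a line-by-line parser could look like. It is not the poster's code and not part of Biopython; the names `TblFeature` and `parse_tbl` are made up for illustration. It assumes the layout shown in the example above: a `>Feature <seq_id>` header, location lines that begin with two coordinates (with a feature key as third field when a new feature starts), and every other line being a qualifier name followed by an optional value. It splits on any whitespace, so it should cope with both the tab-separated original and a space-flattened copy.

```python
from dataclasses import dataclass, field


@dataclass
class TblFeature:
    """One feature block (hypothetical container, not a Biopython class)."""
    key: str
    intervals: list = field(default_factory=list)   # (start, stop) as written in the file
    qualifiers: list = field(default_factory=list)  # (name, value) pairs, in file order


def _coord(token):
    """Return the token as an int coordinate, or None if it is not one.

    A leading '<' or '>' (partial-location marker) is dropped here for simplicity.
    """
    stripped = token.lstrip("<>")
    return int(stripped) if stripped.isdigit() else None


def parse_tbl(lines):
    """Parse feature-table lines into a dict: seq_id -> list of TblFeature."""
    records = {}
    features = None  # feature list of the record currently being read
    current = None   # feature currently being filled
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        if parts[0].startswith(">Feature"):
            seq_id = parts[1] if len(parts) > 1 else ""
            features = records.setdefault(seq_id, [])
            current = None
            continue
        if features is None:
            raise ValueError("content found before any >Feature header")
        start = _coord(parts[0])
        stop = _coord(parts[1]) if len(parts) > 1 else None
        if start is not None and stop is not None:
            if len(parts) >= 3:
                # Two coordinates plus a key: a new feature starts here.
                current = TblFeature(key=parts[2])
                features.append(current)
            elif current is None:
                raise ValueError("interval line without a preceding feature")
            # With or without a key, the coordinates are one interval.
            current.intervals.append((start, stop))
        else:
            if current is None:
                raise ValueError("qualifier line without a preceding feature")
            # Qualifier: name, then the rest of the line as its value.
            fields = raw.split(None, 1)
            value = fields[1].strip() if len(fields) > 1 else ""
            current.qualifiers.append((fields[0], value))
    return records
```

Calling `parse_tbl(open("annotation.tbl"))` (file name hypothetical) would return one feature list per `>Feature` header. Coordinates are kept exactly as written; in this format they are 1-based, and a start greater than the stop indicates a feature on the minus strand, so strand handling and conversion to SeqFeature objects are left to the caller.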