GFFtools-GX
gff-to-gtf off-by-one error?
I ran into what looks like an off-by-one error when converting a GFF file to GTF with gff_to_gtf.py.
This is the gene model in the original GFF:
scaffold_00034 JGI exon 1 472 . - . name "CLAGR_004651-RA"; transcriptId 5353
scaffold_00034 JGI CDS 1 472 . - 1 name "CLAGR_004651-RA"; proteinId 5305; exonNumber 3
scaffold_00034 JGI exon 527 1274 . - . name "CLAGR_004651-RA"; transcriptId 5353
scaffold_00034 JGI CDS 527 1274 . - 2 name "CLAGR_004651-RA"; proteinId 5305; exonNumber 2
scaffold_00034 JGI exon 1326 1593 . - . name "CLAGR_004651-RA"; transcriptId 5353
scaffold_00034 JGI CDS 1326 1593 . - 0 name "CLAGR_004651-RA"; proteinId 5305; exonNumber 1
scaffold_00034 JGI start_codon 1591 1593 . - 0 name "CLAGR_004651-RA"
The exon 3 annotation spans 1…472 on the - strand. After conversion to GTF, this is the result:
scaffold_00034 JGI exon 0 472 . - . gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "1"; gene_name "";
scaffold_00034 JGI CDS 1 472 . - 1 gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "1"; gene_name "";
scaffold_00034 JGI start_codon 1591 1593 . - 1 gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "1"; gene_name "";
scaffold_00034 JGI exon 527 1274 . - . gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "2"; gene_name "";
scaffold_00034 JGI CDS 527 1274 . - 2 gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "2"; gene_name "";
scaffold_00034 JGI exon 1326 1593 . - . gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "3"; gene_name "";
scaffold_00034 JGI CDS 1326 1593 . - 0 gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "3"; gene_name "";
scaffold_00034 JGI stop_codon 1 3 . - 0 gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "3"; gene_name "";
The exon now spans 0…472 instead of 1…472, while the CDS on the same interval still starts at 1. I suspect this happens when a feature borders the end of the sequence (here, the very first base of the scaffold). Is that the case?
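For anyone who wants to check their own converted files: GFF and GTF coordinates are 1-based, so a start of 0 is never valid. Below is a minimal sketch that flags such records. It assumes the first eight columns are separated by tabs or spaces, and the function name `find_out_of_range` is made up for illustration; it is not part of GFFtools-GX.

```python
def find_out_of_range(gtf_lines):
    """Yield (line_number, feature, start, end) for records whose start is below 1.

    GFF/GTF coordinates are 1-based and inclusive, so start == 0 is invalid.
    """
    for line_number, line in enumerate(gtf_lines, start=1):
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        # Split on any whitespace; the 9th field keeps the attributes intact.
        fields = line.split(None, 8)
        if len(fields) < 8:
            continue
        feature, start, end = fields[2], int(fields[3]), int(fields[4])
        if start < 1:
            yield (line_number, feature, start, end)


# Two records taken from the converted output above.
sample = [
    'scaffold_00034 JGI exon 0 472 . - . gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "1"; gene_name "";',
    'scaffold_00034 JGI CDS 1 472 . - 1 gene_id "CLAGR_004651-RA"; transcript_id "5305"; exon_number "1"; gene_name "";',
]

bad = list(find_out_of_range(sample))
for line_number, feature, start, end in bad:
    print(f"line {line_number}: {feature} {start}-{end} starts before position 1")
```

On the two sample records, only the exon line is reported. To scan a whole file, pass an open file handle instead of the list.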