GFF3toolkit
Error in gff3_fix
Hi, after successfully running gff3_QC, gff3_fix gives me the following error:
(genometools) [safiand@login001 grass]$ gff3_fix -qc_r test.txt -g turneri_annotation.gff3 -og new_corrected.gff3
INFO Checking QC report file (test.txt)...
INFO Checking GFF3 file (turneri_annotation.gff3)...
INFO Reading QC report file: (test.txt)...
INFO Reading GFF3 file: (turneri_annotation.gff3)...
Traceback (most recent call last):
File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/bin/gff3_fix", line 8, in <module>
sys.exit(script_main())
File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/bin/gff3_fix.py", line 95, in script_main
gff3_fix.fix.main(gff3=gff3, output_gff=args.output_gff, error_dict=error_dict, line_num_dict=line_num_dict, logger=logger_stderr)
File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/lib/gff3_fix/fix.py", line 692, in main
split(gff3=gff3, error_list=error_dict[error_code], logger=logger)
File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/lib/gff3_fix/fix.py", line 165, in split
childrenlist.append(c1['attributes']['ID'])
KeyError: 'ID'
So I tried gff3_ID_generator.py, but it also gives me a similar error:
(genometools) [safiand@login001 grass]$ python gff3_ID_generator.py -g turneri_annotation.gff3 -og new.gff3
INFO Reading input gff3 file: (turneri_annotation.gff3)
INFO Generate new ID for features in (turneri_annotation.gff3)
Traceback (most recent call last):
File "/camp/lab/cardoso-moreiam/home/users/safiand/genome_annotation/turneri/busco/turneri_rna_prot_multiples_species/grass/gff3_ID_generator.py", line 333, in <module>
main(in_gff=args.gff, merge_report=args.merge_report, out_merge_report=args.out_merge_report, out_gff=args.output_gff, uuid_on=args.universally_unique_identifier, prefix=args.idprefix, digitlen=args.digitlen, report=args.report, alias=args.alias)
File "/camp/lab/cardoso-moreiam/home/users/safiand/genome_annotation/turneri/busco/turneri_rna_prot_multiples_species/grass/gff3_ID_generator.py", line 238, in main
ID_dict[child['attributes']['ID']] = [newcID]
KeyError: 'ID'
What can I do to solve this problem? Am I doing something wrong?
My gff3 file looks like this:
(genometools) [safiand@login001 grass]$ head turneri_annotation.gff3 -n 20
# gffread augustus.hints.gtf -o turnerifiltered.gff3 --merge -L -g GCA_922788865.1_HVK001PTURNERI_genomic.shortID.fna
# gffread v0.11.6
##gff-version 3
CAKLNU010000942.1 gffcl locus 724 2835 . + . ID=RLOC_00000001;transcripts=jg1.t1
CAKLNU010000942.1 AUGUSTUS transcript 724 2835 . + . ID=jg1.t1;geneID=jg1;locus=RLOC_00000001
CAKLNU010000942.1 AUGUSTUS CDS 724 1083 . + 0 Parent=jg1.t1
CAKLNU010000942.1 AUGUSTUS CDS 1181 1625 0.34 + 0 Parent=jg1.t1
CAKLNU010000942.1 AUGUSTUS CDS 2270 2835 0.42 + 2 Parent=jg1.t1
CAKLNU010000422.1 gffcl locus 1528 9153 . + . ID=RLOC_00000002;transcripts=jg2.t1
CAKLNU010000422.1 AUGUSTUS transcript 1528 9153 . + . ID=jg2.t1;geneID=jg2;locus=RLOC_00000002
CAKLNU010000422.1 AUGUSTUS CDS 1528 1574 0.69 + 1 Parent=jg2.t1
CAKLNU010000422.1 AUGUSTUS CDS 1718 1788 0.68 + 2 Parent=jg2.t1
CAKLNU010000422.1 AUGUSTUS CDS 9010 9153 0.6 + 0 Parent=jg2.t1
CAKLNU010000746.1 gffcl locus 834 3644 . - . ID=RLOC_00000003;transcripts=jg3.t1
CAKLNU010000746.1 AUGUSTUS transcript 834 3644 . - . ID=jg3.t1;geneID=jg3;locus=RLOC_00000003
CAKLNU010000746.1 AUGUSTUS CDS 834 878 0.96 - 2 Parent=jg3.t1
CAKLNU010000746.1 AUGUSTUS CDS 988 1011 1 - 2 Parent=jg3.t1
CAKLNU010000746.1 AUGUSTUS CDS 1310 1336 1 - 2 Parent=jg3.t1
CAKLNU010000746.1 AUGUSTUS CDS 2483 2518 1 - 2 Parent=jg3.t1
CAKLNU010000746.1 AUGUSTUS CDS 2597 2695 1 - 2 Parent=jg3.t1
Thanks!
@DiegoSafian Apologies, I completely missed this issue. Can you try removing the locus features from your gff3 file to see whether they are what the ID generator is erroring out on?
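If it helps, here is a minimal sketch of one way to strip the locus rows before re-running the tools. It is not part of GFF3toolkit; the function names and the output filename are made up for illustration, and it assumes the file is tab-delimited with the feature type in column 3, as in standard GFF3. In the excerpt above the transcripts point to their locus through a `locus=` attribute rather than `Parent=`, so dropping the locus rows should not orphan anything, but that is only based on the 20 lines shown.

```python
def drop_feature_type(lines, feature_type="locus"):
    """Yield GFF3 lines, skipping feature rows whose type column matches.

    Hypothetical helper for illustration; not a GFF3toolkit function.
    """
    for line in lines:
        # Keep comments, directives, and blank lines untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        columns = line.rstrip("\n").split("\t")
        # Column 3 (index 2) holds the feature type in GFF3.
        if len(columns) >= 3 and columns[2] == feature_type:
            continue
        yield line


def filter_gff3(in_path, out_path, feature_type="locus"):
    """Copy in_path to out_path without rows of the given feature type."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_feature_type(src, feature_type))


# Example call (the output filename is an assumption):
# filter_gff3("turneri_annotation.gff3", "turneri_annotation.no_locus.gff3")
```

After that, gff3_QC and gff3_fix can be pointed at the filtered file. If the `KeyError: 'ID'` persists, it may be worth checking the CDS rows too, since in the excerpt they carry only `Parent=` and no `ID=` attribute.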