Error in gff3_fix

Open DiegoSafian opened this issue 3 years ago • 1 comments

Hi, after successfully running gff3_QC, gff3_fix gives me the following error:

(genometools) [safiand@login001 grass]$ gff3_fix -qc_r test.txt -g turneri_annotation.gff3 -og new_corrected.gff3
INFO     Checking QC report file (test.txt)...
INFO     Checking GFF3 file (turneri_annotation.gff3)...
INFO     Reading QC report file: (test.txt)...
INFO     Reading GFF3 file: (turneri_annotation.gff3)...
Traceback (most recent call last):
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/bin/gff3_fix", line 8, in <module>
    sys.exit(script_main())
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/bin/gff3_fix.py", line 95, in script_main
    gff3_fix.fix.main(gff3=gff3, output_gff=args.output_gff, error_dict=error_dict, line_num_dict=line_num_dict, logger=logger_stderr)
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/lib/gff3_fix/fix.py", line 692, in main
    split(gff3=gff3, error_list=error_dict[error_code], logger=logger)
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/lib/gff3_fix/fix.py", line 165, in split
    childrenlist.append(c1['attributes']['ID'])
KeyError: 'ID'

So I tried gff3_ID_generator.py, but it also gives me a similar error:

(genometools) [safiand@login001 grass]$ python gff3_ID_generator.py -g turneri_annotation.gff3 -og new.gff3
INFO     Reading input gff3 file: (turneri_annotation.gff3)
INFO     Generate new ID for features in (turneri_annotation.gff3)
Traceback (most recent call last):
  File "/camp/lab/cardoso-moreiam/home/users/safiand/genome_annotation/turneri/busco/turneri_rna_prot_multiples_species/grass/gff3_ID_generator.py", line 333, in <module>
    main(in_gff=args.gff, merge_report=args.merge_report, out_merge_report=args.out_merge_report, out_gff=args.output_gff, uuid_on=args.universally_unique_identifier, prefix=args.idprefix, digitlen=args.digitlen, report=args.report, alias=args.alias)
  File "/camp/lab/cardoso-moreiam/home/users/safiand/genome_annotation/turneri/busco/turneri_rna_prot_multiples_species/grass/gff3_ID_generator.py", line 238, in main
    ID_dict[child['attributes']['ID']] = [newcID]
KeyError: 'ID'

What can I do to solve this problem? Am I doing something wrong?

My GFF3 file looks like this:

(genometools) [safiand@login001 grass]$ head turneri_annotation.gff3 -n 20
# gffread augustus.hints.gtf -o turnerifiltered.gff3 --merge -L -g GCA_922788865.1_HVK001PTURNERI_genomic.shortID.fna
# gffread v0.11.6
##gff-version 3
CAKLNU010000942.1       gffcl   locus   724     2835    .       +       .       ID=RLOC_00000001;transcripts=jg1.t1
CAKLNU010000942.1       AUGUSTUS        transcript      724     2835    .       +       .       ID=jg1.t1;geneID=jg1;locus=RLOC_00000001
CAKLNU010000942.1       AUGUSTUS        CDS     724     1083    .       +       0       Parent=jg1.t1
CAKLNU010000942.1       AUGUSTUS        CDS     1181    1625    0.34    +       0       Parent=jg1.t1
CAKLNU010000942.1       AUGUSTUS        CDS     2270    2835    0.42    +       2       Parent=jg1.t1
CAKLNU010000422.1       gffcl   locus   1528    9153    .       +       .       ID=RLOC_00000002;transcripts=jg2.t1
CAKLNU010000422.1       AUGUSTUS        transcript      1528    9153    .       +       .       ID=jg2.t1;geneID=jg2;locus=RLOC_00000002
CAKLNU010000422.1       AUGUSTUS        CDS     1528    1574    0.69    +       1       Parent=jg2.t1
CAKLNU010000422.1       AUGUSTUS        CDS     1718    1788    0.68    +       2       Parent=jg2.t1
CAKLNU010000422.1       AUGUSTUS        CDS     9010    9153    0.6     +       0       Parent=jg2.t1
CAKLNU010000746.1       gffcl   locus   834     3644    .       -       .       ID=RLOC_00000003;transcripts=jg3.t1
CAKLNU010000746.1       AUGUSTUS        transcript      834     3644    .       -       .       ID=jg3.t1;geneID=jg3;locus=RLOC_00000003
CAKLNU010000746.1       AUGUSTUS        CDS     834     878     0.96    -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     988     1011    1       -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     1310    1336    1       -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     2483    2518    1       -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     2597    2695    1       -       2       Parent=jg3.t1

Thanks!

DiegoSafian, Dec 27 '22 12:12

@DiegoSafian apologies, I completely missed this issue. Can you try removing the locus features from your gff3 file, to see if that is what the ID generator is erroring out on?
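
One way to try this is a small filter like the sketch below. It is not part of GFF3toolkit: the function names `drop_feature_type` and `filter_file` are hypothetical, and it assumes the file is tab-separated with the feature type in column 3, as in standard GFF3. In the excerpt above the transcripts point at their locus through a `locus=` attribute rather than `Parent=`, so dropping the locus rows should not leave any `Parent` reference dangling.

```python
def drop_feature_type(lines, feature_type="locus"):
    """Yield GFF3 lines, skipping features whose type (column 3) matches."""
    for line in lines:
        # Pass comments, directives, and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 3 and columns[2] == feature_type:
            continue  # drop this feature row
        yield line


def filter_file(in_path, out_path, feature_type="locus"):
    """Write a copy of in_path to out_path without rows of feature_type."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_feature_type(src, feature_type))


# Example call (the output file name is only an illustration):
# filter_file("turneri_annotation.gff3", "turneri_no_locus.gff3")
```

The filtered file can then be passed to gff3_ID_generator.py in place of the original to see whether the `KeyError: 'ID'` still occurs.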

mpoelchau, Mar 06 '23 14:03
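
Since both tracebacks end in `KeyError: 'ID'`, it may also be worth checking which feature types in the file have no `ID` attribute at all. In the excerpt above, the CDS rows carry only `Parent=`. The following is a minimal sketch, not GFF3toolkit code: `parse_attributes` and `count_missing_id` are hypothetical helper names, and the file is assumed to be tab-separated with nine columns per feature row.

```python
from collections import Counter


def parse_attributes(field):
    """Parse a GFF3 column-9 string such as 'ID=x;Parent=y' into a dict."""
    attributes = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def count_missing_id(lines):
    """Count, per feature type, the rows whose attributes have no ID key."""
    missing = Counter()
    for line in lines:
        # Ignore comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # skip rows that are not nine-column feature lines
        if "ID" not in parse_attributes(columns[8]):
            missing[columns[2]] += 1
    return missing


# Example call:
# with open("turneri_annotation.gff3") as handle:
#     print(count_missing_id(handle))
```

If the tally shows only child features such as CDS without an `ID`, that narrows down which rows the tools are failing on, whether or not the locus features are also involved.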