KeyError vogdb categories
Hello,
I ran DRAM-v on seven genomes. All but one passed; the one that failed gave this error message:
/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/lib/python3.10/site-packages/mag_annotator/annotate_bins.py:603: UserWarning: No rRNAs were detected, no rrnas.tsv file will be created.
warnings.warn('No rRNAs were detected, no rrnas.tsv file will be created.')
2022-05-07 17:54:36.669004: Viral annotation started
0:00:00.263488: Retrieved database locations and descriptions
0:00:00.263541: Annotating final-viral-combined-for-dramv
0:00:00.422184: Turning genes from prodigal to mmseqs2 db
0:00:02.355397: Getting hits from kofam
0:01:01.380347: Getting forward best hits from viral
0:01:04.454533: Getting forward best hits from peptidase
0:01:08.947004: Getting hits from pfam
0:01:22.763530: Getting hits from dbCAN
0:01:29.084671: Getting hits from VOGDB
0:02:08.138741: Merging ORF annotations
0:02:09.722065: Annotations complete, processing annotations
0:02:09.738307: Annotations complete, assigning auxiliary scores and flags
0:02:09.831099: Completed annotations
0:00:00.026330: Retrieved database locations and descriptions
0:00:00.027495: Determined potential amgs
Traceback (most recent call last):
File "/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'vogdb_categories'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/bin/DRAM-v.py", line 153, in <module>
args.func(**args_dict)
File "/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/lib/python3.10/site-packages/mag_annotator/summarize_vgfs.py", line 240, in summarize_vgfs
viral_genome_stats = make_viral_stats_table(annotations, potential_amgs, groupby_column)
File "/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/lib/python3.10/site-packages/mag_annotator/summarize_vgfs.py", line 92, in make_viral_stats_table
for i in frame['vogdb_categories']]) / frame.shape[0]
File "/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/lib/python3.10/site-packages/pandas/core/frame.py", line 3505, in __getitem__
indexer = self.columns.get_loc(key)
File "/fs02/ie79/Zarul/prophage_detection/.snakemake/conda/a41f0976fb8154f5853379be34959793/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: 'vogdb_categories'
Thank you.
Hmm, it looks like your VOGDB is missing its description file. Can you run DRAM-setup.py print_config and post the output?
Here you go:
Processed search databases
KEGG db: None
KOfam db: /fs03/ie79/db/DRAM_data/kofam_profiles.hmm
KOfam KO list: /fs03/ie79/db/DRAM_data/kofam_ko_list.tsv
UniRef db: None
Pfam db: /fs03/ie79/db/DRAM_data/pfam.mmspro
dbCAN db: /fs03/ie79/db/DRAM_data/dbCAN-HMMdb-V9.txt
RefSeq Viral db: /fs03/ie79/db/DRAM_data/refseq_viral.20210917.mmsdb
MEROPS peptidase db: /fs03/ie79/db/DRAM_data/peptidases.20210917.mmsdb
VOGDB db: /fs03/ie79/db/DRAM_data/vog_latest_hmms.txt
Descriptions of search database entries
Pfam hmm dat: /fs03/ie79/db/DRAM_data/Pfam-A.hmm.dat.gz
dbCAN family activities: /fs03/ie79/db/DRAM_data/CAZyDB.07302020.fam-activities.txt
VOG annotations: /fs03/ie79/db/DRAM_data/vog_annotations_latest.tsv.gz
Description db: /fs03/ie79/db/DRAM_data/description_db.sqlite
DRAM distillation sheets
Genome summary form: /fs03/ie79/db/DRAM_data/genome_summary_form.20210917.tsv
Module step form: /fs03/ie79/db/DRAM_data/module_step_form.20210917.tsv
ETC module database: /fs03/ie79/db/DRAM_data/etc_mdoule_database.20210917.tsv
Function heatmap form: /fs03/ie79/db/DRAM_data/function_heatmap_form.20210917.tsv
AMG database: /fs03/ie79/db/DRAM_data/amg_database.20210917.tsv
I have also checked, and all of the files are present:
for i in /fs03/ie79/db/DRAM_data/kofam_profiles.hmm /fs03/ie79/db/DRAM_data/kofam_ko_list.tsv /fs03/ie79/db/DRAM_data/pfam.mmspro /fs03/ie79/db/DRAM_data/dbCAN-HMMdb-V9.txt /fs03/ie79/db/DRAM_data/refseq_viral.20210917.mmsdb /fs03/ie79/db/DRAM_data/peptidases.20210917.mmsdb /fs03/ie79/db/DRAM_data/vog_latest_hmms.txt /fs03/ie79/db/DRAM_data/Pfam-A.hmm.dat.gz /fs03/ie79/db/DRAM_data/CAZyDB.07302020.fam-activities.txt /fs03/ie79/db/DRAM_data/vog_annotations_latest.tsv.gz /fs03/ie79/db/DRAM_data/description_db.sqlite /fs03/ie79/db/DRAM_data/genome_summary_form.20210917.tsv /fs03/ie79/db/DRAM_data/module_step_form.20210917.tsv /fs03/ie79/db/DRAM_data/etc_mdoule_database.20210917.tsv /fs03/ie79/db/DRAM_data/function_heatmap_form.20210917.tsv /fs03/ie79/db/DRAM_data/amg_database.20210917.tsv ; do ls $i ; done
/fs03/ie79/db/DRAM_data/kofam_profiles.hmm
/fs03/ie79/db/DRAM_data/kofam_ko_list.tsv
/fs03/ie79/db/DRAM_data/pfam.mmspro
/fs03/ie79/db/DRAM_data/dbCAN-HMMdb-V9.txt
/fs03/ie79/db/DRAM_data/refseq_viral.20210917.mmsdb
/fs03/ie79/db/DRAM_data/peptidases.20210917.mmsdb
/fs03/ie79/db/DRAM_data/vog_latest_hmms.txt
/fs03/ie79/db/DRAM_data/Pfam-A.hmm.dat.gz
/fs03/ie79/db/DRAM_data/CAZyDB.07302020.fam-activities.txt
/fs03/ie79/db/DRAM_data/vog_annotations_latest.tsv.gz
/fs03/ie79/db/DRAM_data/description_db.sqlite
/fs03/ie79/db/DRAM_data/genome_summary_form.20210917.tsv
/fs03/ie79/db/DRAM_data/module_step_form.20210917.tsv
/fs03/ie79/db/DRAM_data/etc_mdoule_database.20210917.tsv
/fs03/ie79/db/DRAM_data/function_heatmap_form.20210917.tsv
/fs03/ie79/db/DRAM_data/amg_database.20210917.tsv
Unfortunately, the most likely cause is that VOGDB had 0 hits for that category, so the vogdb_categories column named in the KeyError is missing from the annotations. I will work on pushing a fix, but in the meantime a quick workaround is to add an empty vogdb_categories column to your annotations file.
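In case it helps, here is a minimal sketch of that workaround using pandas. It assumes the annotations file is tab-separated with the row identifiers in the first column; the function name add_missing_column and the file names in the usage comment are only illustrative and are not part of DRAM. Work on a copy of the file rather than the original.

```python
import pandas as pd


def add_missing_column(annotations_path, output_path, column="vogdb_categories"):
    """Add an empty column to a tab-separated annotations table if it is absent.

    Hypothetical helper, not part of DRAM. Assumes the first column of the
    file holds the row identifiers.
    """
    annotations = pd.read_csv(annotations_path, sep="\t", index_col=0)
    if column not in annotations.columns:
        # NaN is written out as an empty field, i.e. "no value" for every row.
        annotations[column] = float("nan")
    annotations.to_csv(output_path, sep="\t")
    return annotations


# Example call (paths are placeholders):
# add_missing_column("annotations.tsv", "annotations.fixed.tsv")
```

If the column is already present, the table is written back unchanged, so running this on the genomes that passed should be harmless. After that, the summary step can be re-run against the patched file.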
If the latest update, DRAM 1.4, does not fix your issue, let me know and re-open this ticket. I am closing it for now because I believe it is solved.