
DRAM-v.py --low_mem_mode breaks distill

Open mlhoggard opened this issue 2 years ago • 0 comments

Hi there,

We have DRAM (1.4.6) set up with the full KEGG database, which has been working fine. I recently wanted to run some viral contigs using KOfam instead of full KEGG via DRAM-v.py annotate --low_mem_mode. The annotate step completed, but DRAM-v.py distill then failed with the error: KeyError: 'vogdb_categories'

Checking annotation.tsv, vogdb_categories is indeed missing. But a similar run without --low_mem_mode did have vogdb_categories (and that worked with distill).
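
For anyone wanting to run the same check before calling distill, here is a minimal sketch that reads only the header row of a tab-separated annotations file and reports which required columns are absent. The helper name missing_columns and the demo column names are my own, for illustration only; they are not part of DRAM.

```python
import csv
import io


def missing_columns(handle, required):
    """Return the names in `required` that are absent from the TSV header row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in required if name not in header]


# Illustrative header only; a real annotations file has many more columns.
demo = io.StringIO("gene_id\tkofam_hits\tpfam_hits\ncontig_1_1\tK00001\tPF00001\n")
print(missing_columns(demo, ["vogdb_categories"]))
```

With a real file, pass an open handle instead, e.g. open(path, newline="").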

I tracked down this section in database_handler.py, where the --low_mem_mode settings are applied:

if low_mem_mode:
    if ("kofam_hmm" not in self.config.get("search_databases")) or (
        "kofam_ko_list" not in self.config.get("search_databases")
    ):
        raise ValueError(
            "To run in low memory mode KOfam must be configured for use in DRAM"
        )
    dbs_to_use = [i for i in dbs_to_use if i not in ("uniref", "kegg", "vogdb")]

So, evidently (based on the last line above) --low_mem_mode always excludes vogdb as well as uniref and kegg (although the help message only notes the latter two), but vogdb is required for DRAM-v's version of distill.

Is it possible to have --low_mem_mode behave differently for DRAM.py and DRAM-v.py, so that the latter doesn't exclude vogdb? Or alternatively, remove vogdb from the --low_mem_mode exclusion list above, and add a separate --exclude_vogdb option?
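
To make the second suggestion concrete, here is a rough standalone sketch of the selection logic I have in mind. It is not DRAM's actual code: the function select_dbs, its parameters, and the exclude_vogdb flag are all hypothetical, and the real change would need to go into database_handler.py and the argument parsers.

```python
# Hypothetical sketch: low memory mode drops only uniref and kegg,
# while vogdb is dropped only when explicitly requested.
LOW_MEM_EXCLUDED = ("uniref", "kegg")


def select_dbs(dbs_to_use, search_databases, low_mem_mode=False, exclude_vogdb=False):
    """Return the databases to search after applying the two options."""
    if low_mem_mode:
        if ("kofam_hmm" not in search_databases) or (
            "kofam_ko_list" not in search_databases
        ):
            raise ValueError(
                "To run in low memory mode KOfam must be configured for use in DRAM"
            )
        dbs_to_use = [db for db in dbs_to_use if db not in LOW_MEM_EXCLUDED]
    if exclude_vogdb:
        dbs_to_use = [db for db in dbs_to_use if db != "vogdb"]
    return dbs_to_use


configured = {"kofam_hmm", "kofam_ko_list", "vogdb", "kegg", "uniref"}
print(select_dbs(["uniref", "kegg", "kofam_hmm", "vogdb"], configured, low_mem_mode=True))
```

Under this sketch, DRAM-v.py with --low_mem_mode would keep vogdb, so vogdb_categories would still be written for distill, and anyone relying on the current behaviour could pass the extra flag.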

Cheers! Mike.

mlhoggard, Jul 06 '23 05:07