
checkm.hmm

Open yangfangming opened this issue 3 years ago • 2 comments

I have a question about "checkm.hmm". I noticed that the "checkm.hmm" file contains some phage-related HMMs. Is there a way to get only the bacterial or archaeal HMMs? Thanks!

When I grep "Phage" from "checkm.hmm", I get something like this:

NAME Phage_min_tail
DESC Phage minor tail protein
NAME Phage_tail_T
NAME Phage_tube
DESC Phage tail tube protein FII
NAME Phage_CP76
DESC Phage regulatory protein CII (CP76)
NAME Phage_P2_GpU
DESC Phage P2 GpU
NAME Phage_rep_org_N

yangfangming, Sep 29 '22 05:09

Hi. CheckM uses different marker sets depending on where your genome is placed in a reference tree. There is no single bacterial or archaeal marker set. You can get lists of the taxon-specific marker sets used by CheckM's taxonomy_wf (not the typical way to run CheckM!) in the taxon_marker_sets.tsv file.

donovan-h-parks, Sep 29 '22 14:09

The marker sets used for different placements in the CheckM reference tree are given in the genome_tree/genome_tree.metadata.tsv file. That file is not easy for a human to parse, though.

donovan-h-parks, Sep 29 '22 14:09
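
For anyone who has already collected the marker identifiers they care about (for example from taxon_marker_sets.tsv), here is a minimal Python sketch of pulling just those models out of checkm.hmm. It is not part of CheckM. It assumes the file is in HMMER3 text format, where each model has a NAME line, usually an ACC line, and ends with a line containing only `//`. The function names, the `wanted` set, and the accessions in the demo data are hypothetical placeholders, and matching on either NAME or ACC is an assumption made because this thread does not say which identifier the marker sets use.

```python
from typing import Iterable, Iterator, List, Set


def iter_hmm_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each model as a list of lines, ending with its '//' terminator."""
    record: List[str] = []
    for line in lines:
        record.append(line)
        if line.strip() == "//":
            yield record
            record = []


def record_field(record: List[str], key: str) -> str:
    """Return a header field such as NAME, ACC or DESC ('' if absent)."""
    for line in record:
        parts = line.split(None, 1)
        if not parts:
            continue
        if parts[0] == "HMM":
            break  # header section is over; the model body follows
        if parts[0] == key:
            return parts[1].strip() if len(parts) > 1 else ""
    return ""


def filter_hmms(lines: Iterable[str], wanted: Set[str]) -> List[str]:
    """Keep only models whose NAME or ACC is in `wanted` (assumed identifiers)."""
    kept: List[str] = []
    for record in iter_hmm_records(lines):
        name = record_field(record, "NAME")
        acc = record_field(record, "ACC")
        if name in wanted or acc in wanted:
            kept.extend(record)
    return kept


# Tiny made-up input: headers only, with placeholder accessions.
demo = [
    "HMMER3/f [demo]\n",
    "NAME  Phage_tube\n",
    "ACC   PF99991.1\n",
    "DESC  Phage tail tube protein FII\n",
    "//\n",
    "HMMER3/f [demo]\n",
    "NAME  Demo_marker\n",
    "ACC   PF99992.1\n",
    "//\n",
]

subset = filter_hmms(demo, {"PF99992.1"})
print("".join(subset), end="")
```

On the real file, the same function could be given an open file handle in place of `demo`, with the returned lines written to a new file using `writelines`. Filtering by an explicit list of identifiers, rather than excluding anything with "Phage" in its name, fits the point made above that there is no single bacterial or archaeal marker set.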