The effect of non-prokaryotic sequences

Open neptuneyt opened this issue 3 years ago • 2 comments

Dear CheckM team, thanks a lot for developing such wonderful software. I have a question. CheckM estimates contamination and completeness from prokaryotic marker genes, so if a high-quality bacterial genome is accidentally mixed with eukaryotic sequences (e.g. fungi, algae, or a eukaryotic host) or viral sequences, it is still assessed as high quality. I tested this and that is what happens. Hopefully the effect of these non-prokaryotic sequences will be considered in the next version. Best wishes!
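
To illustrate why this happens, here is a minimal sketch of a marker-based estimate. It is a deliberate simplification and not CheckM's actual algorithm (CheckM works with lineage-specific, collocated marker sets); the function name `estimate_quality`, the marker IDs, and the contig names are all hypothetical. The point is only that contigs with no hits to the bacterial marker set leave both numbers unchanged.

```python
from collections import Counter


def estimate_quality(contig_hits, marker_set):
    """Toy marker-based estimate (NOT CheckM's real algorithm).

    contig_hits: dict mapping contig name -> list of marker IDs hit on it.
    marker_set:  set of single-copy marker IDs expected once per genome.
    Returns (completeness %, contamination %).
    """
    # Count only hits that belong to the expected marker set.
    counts = Counter(
        marker
        for hits in contig_hits.values()
        for marker in hits
        if marker in marker_set
    )
    present = sum(1 for m in marker_set if counts[m] >= 1)
    extra_copies = sum(counts[m] - 1 for m in marker_set if counts[m] > 1)
    completeness = 100.0 * present / len(marker_set)
    contamination = 100.0 * extra_copies / len(marker_set)
    return completeness, contamination


# Hypothetical marker set and bin contents.
markers = {"m1", "m2", "m3", "m4"}
bacterial_bin = {"contig_1": ["m1", "m2"], "contig_2": ["m3", "m4"]}

# Same bin plus eukaryotic/viral contigs that carry no bacterial markers.
mixed_bin = dict(bacterial_bin, fungal_contig=[], viral_contig=["not_a_marker"])

# Same bin plus contigs from a second bacterium: duplicated markers appear.
two_bacteria_bin = dict(bacterial_bin, other_contig=["m1", "m2"])

clean = estimate_quality(bacterial_bin, markers)
mixed = estimate_quality(mixed_bin, markers)
doubled = estimate_quality(two_bacteria_bin, markers)
print(clean, mixed, doubled)
```

In this toy model the bin mixed with non-prokaryotic contigs scores exactly like the clean bin, while contamination from a second bacterium is visible because its markers show up as extra copies.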

neptuneyt avatar Nov 01 '22 10:11 neptuneyt

I am a new user of CheckM, but I vaguely remember something about contamination in the output as well. Theoretically, given that the backbone of CheckM is a collection of HMM marker sets for various taxa, one could detect contamination by finding hits to a marker set from a significantly different lineage. One just needs to pick a suitable workflow, I guess.

azat-badretdin avatar Nov 01 '22 12:11 azat-badretdin

CheckM2 is almost certainly a better option if eukaryotic or viral contamination is a concern: https://github.com/chklovski/CheckM2

donovan-h-parks avatar Nov 01 '22 14:11 donovan-h-parks