I am not able to interpret the results
Hi, thanks for such a nice tool. I have sequenced several single-cell samples of a protist species.
I am having trouble understanding my smudgeplot. I used the following commands to generate it:
FastK -v -t4 -k40 -M16 -T16 78bam_[12].fastq.gz -N78
smudgeplot.py hetmers -L 12 -t 8 -o 78 --verbose 78.ktab
smudgeplot.py all -o 78_SMUDGE 78_text.smu
I am running version 0.4.0
and it looks like this:
and the GenomeScope 2 histogram looks like this:
For instance, another sample:
The other samples look very similar.
Only two of the samples look more or less good:
and this one:
How should I understand my smudgeplots?
Thank you very much
Hi, sequencing protists is hard!
If you look at the examples here https://pubmed.ncbi.nlm.nih.gov/39890468/ we show what "nice" and "messy" spectra look like. In the first two cases, you did not sequence your target well. There might be sequences from your target, but it was definitely not sequenced cleanly on its own. With protists, the most likely reason is contamination, which is very likely taking over the runs...
The latter two look a bit better, but they are still rather messy. The third one shows two peaks, and to me it's rather unclear which of them is your protist (they can't both be, because then they would follow the rules of stoichiometry; see that paper for more details). I suspect the smudgeplot would not show you much if the right bump is a bacterium (more likely given the genome size) and the left one is your target (you filtered out everything with coverage <12x, which means you removed most of your target before making the smudgeplots).
The good news for you is that ploidy is usually not an issue for protists (as long as it is not an amoeba or something). Even when they are polyploid, they usually have genomes small enough to assemble nicely. I would just try to stitch the genomes together and run the result through BlobToolKit to see what you have actually sequenced. I would recommend starting with the samples that show peaks, because for those you have at least some expectation of coverage and genome size.
Hope this helps.
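To make the two checks above concrete, here is a rough Python sketch. It is not part of smudgeplot or GenomeScope; the function names, the tolerance, and all numbers are made up for illustration. The first function asks whether two peak coverages could belong to the same genome, under the simplifying assumption that peaks of one genome sit at integer multiples of the lowest one (1n, 2n, ...). The second reports what fraction of distinct k-mers in a histogram falls below a lower coverage cutoff such as the 12 passed to -L.

```python
def same_genome_multiple(cov_low, cov_high, max_multiple=4, tol=0.15):
    """Return m if cov_high is roughly m * cov_low (m = 2..max_multiple), else None.

    Assumption: peaks from a single genome sit at integer multiples of the
    1n coverage. tol is a relative tolerance, chosen arbitrarily here.
    """
    ratio = cov_high / cov_low
    for m in range(2, max_multiple + 1):
        if abs(ratio - m) <= tol * m:
            return m
    return None


def fraction_below_cutoff(hist, cutoff):
    """Fraction of distinct k-mers with coverage below the cutoff.

    hist maps coverage -> number of distinct k-mers at that coverage.
    """
    total = sum(hist.values())
    dropped = sum(n for cov, n in hist.items() if cov < cutoff)
    return dropped / total


# Hypothetical peak positions, not taken from the plots in this thread.
print(same_genome_multiple(20, 41))   # close to 2x: could be 1n and 2n of one genome
print(same_genome_multiple(8, 45))    # about 5.6x: not an integer multiple up to 4

# Hypothetical histogram with a low bump around 8x and a high one around 45x.
toy_hist = {5: 100, 8: 300, 11: 100, 40: 200, 45: 300}
print(fraction_below_cutoff(toy_hist, 12))
```

If the lower bump is the target and sits entirely below the cutoff, as in the toy histogram, the k-mer pairs that reach the smudgeplot come almost only from the higher-coverage genome.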
Thank you very, very much for your comments. I will read that paper; it looks like it could be quite useful. I will do as you suggested. Anyway, could the protist genome be haploid (or monoploid)? There is only one peak and very low heterozygosity, ranging from 0.0008 to 0.01. What is your opinion? All the best
Did you sequence a culture? It might be inbred anyway - meaning, the ploidy does not matter, you would see only one haplotype anyway.
The samples are single-cell. Greetings
What do you mean by single cell? Do you mean literally a single cell, amplified and sequenced? I am not aware of any technology that would allow us to meaningfully do that... How did you do it?!
Or do you mean that you had a cell suspension and you had individual cells tagged for sequencing? That I am sure would work, but I have never seen any data like this, so would be quite curious to hear about it...