the default bin size of batch --method wgs

Open deb0612 opened this issue 5 years ago • 3 comments

Dear sir, I tried to apply "cnvkit.py batch Sample1.bam -n Control1.bam -m wgs -f hg38.fasta --annotate refFlat.txt" to my WGS data. When we used gencode.v37.annotation as the target BED, we got very large segments. However, we need smaller segments. We would like to know what the default bin size is.

deb0612 avatar Apr 21 '21 04:04 deb0612

Hi @deb0612 ,

Not an author of CNVkit, but I suggest looking carefully at the output CNVkit prints for your command; you should see a line like: WGS average depth <FLOAT> --> using bin size <THE_NUMBER_YOU_WANT>
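If the run's output was saved to a file, the bin size can be pulled out with a short script. This is only a minimal sketch: it assumes the log line is worded as quoted above, and the function name `find_wgs_bin_size` and the sample text are made up for illustration.

```python
import re

# Assumed wording, taken from the message quoted in this thread; adjust the
# pattern if your CNVkit version prints the line differently.
BIN_SIZE_RE = re.compile(
    r"WGS average depth\s+(?P<depth>[\d.]+)\s*-->\s*"
    r"using bin size\s*(?P<bin_size>\d+)"
)


def find_wgs_bin_size(log_text):
    """Return (average_depth, bin_size) found in the log text, or None."""
    match = BIN_SIZE_RE.search(log_text)
    if match is None:
        return None
    return float(match.group("depth")), int(match.group("bin_size"))


# Illustrative log text, not real CNVkit output.
example_log = "WGS average depth 30.25 --> using bin size 1653\n"
print(find_wgs_bin_size(example_log))
```

CNVkit's progress messages typically go to standard error, so something like `2> batch.log` on the original command should capture them for a script like this.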

However, tweaking the default bin size may not be the best way to make your segments smaller. You may be better off trying another segmentation method (the default is "CBS"). If you are using CNVkit >= v0.9.7, the --segment-method parameter lets you switch easily.

Hope this helps. Have a nice day! Felix.

tetedange13 avatar Apr 22 '21 14:04 tetedange13

As @tetedange13 said (by the way, thank you, those are very helpful tips!), the automatically determined bin size should be fine in almost all situations and is unlikely to significantly influence the resulting segments.

And @deb0612, just to confirm, what do you mean by a very large segment? It's worth noting that, since the majority of the genome is not affected by CNVs, cnvkit.py segment will usually output lots and lots of very large segments (tens of millions of bases long) with normal log2 (≈0), and those are expected.
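To check whether the large segments are just the expected copy-neutral ones, it can help to tally segment lengths from the CNS file. The sketch below assumes a tab-separated CNS file with `chromosome`, `start`, `end` and `log2` columns; the function name, the sample rows and the 0.2 cutoff for "approximately 0" are arbitrary choices for illustration, not CNVkit defaults.

```python
import csv
import io


def summarize_segments(cns_lines, neutral_cutoff=0.2):
    """Tally count and total length of near-neutral vs. altered segments.

    neutral_cutoff is an arbitrary illustrative threshold on |log2|.
    """
    summary = {
        "neutral": {"count": 0, "bases": 0},
        "altered": {"count": 0, "bases": 0},
    }
    for row in csv.DictReader(cns_lines, delimiter="\t"):
        length = int(row["end"]) - int(row["start"])
        is_neutral = abs(float(row["log2"])) < neutral_cutoff
        kind = "neutral" if is_neutral else "altered"
        summary[kind]["count"] += 1
        summary[kind]["bases"] += length
    return summary


# Made-up rows standing in for a real .cns file.
example_cns = io.StringIO(
    "chromosome\tstart\tend\tgene\tlog2\n"
    "chr1\t10000\t50000000\t-\t0.01\n"
    "chr1\t50000000\t50200000\tGENE_A\t-0.95\n"
    "chr2\t10000\t80000000\t-\t-0.03\n"
)
print(summarize_segments(example_cns))
```

With a real file, pass an open file handle instead of the `io.StringIO` object, for example `summarize_segments(open("Sample1.cns"))`, where the file name is a placeholder.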

To investigate further, could you share the CNS file please (or its relevant portion)?

tskir avatar Apr 24 '21 03:04 tskir

For WGS, definitely use segmetrics and call to filter by CI. Then it may be more practical to use bintest or genemetrics to extract only focal CNVs or those that affect genes of interest.
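The idea behind filtering by CI can be sketched as follows: a segment whose log2 confidence interval includes 0 cannot be told apart from copy-neutral, so only segments with an interval entirely above or below 0 are kept as gains or losses. This is an illustration of the concept, not CNVkit's implementation; it assumes the CNS file carries `ci_lo` and `ci_hi` columns (as produced by segmetrics with its CI option), and the function name and sample rows are made up.

```python
import csv
import io


def label_by_ci(cns_lines):
    """Yield (chromosome, start, end, label) for each segment.

    A segment is a "gain" if its whole CI is above 0, a "loss" if its
    whole CI is below 0, and "neutral" if the CI includes 0.
    """
    for row in csv.DictReader(cns_lines, delimiter="\t"):
        ci_lo = float(row["ci_lo"])
        ci_hi = float(row["ci_hi"])
        if ci_lo > 0:
            label = "gain"
        elif ci_hi < 0:
            label = "loss"
        else:
            label = "neutral"
        yield row["chromosome"], int(row["start"]), int(row["end"]), label


# Made-up rows standing in for segmetrics output.
example_cns = io.StringIO(
    "chromosome\tstart\tend\tgene\tlog2\tci_lo\tci_hi\n"
    "chr1\t10000\t50000000\t-\t0.01\t-0.02\t0.04\n"
    "chr1\t50000000\t50200000\tGENE_A\t-0.95\t-1.10\t-0.80\n"
    "chr2\t10000\t900000\tGENE_B\t0.60\t0.45\t0.75\n"
)
for segment in label_by_ci(example_cns):
    print(segment)
```

In practice the filtering itself is better left to `cnvkit.py call` with its CI filter, which also merges adjacent segments; a sketch like this is mainly useful for inspecting which segments would survive.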

etal avatar May 26 '21 20:05 etal