Add SNP calling
It would be nice to have the option of calling variants from bisulfite data.
It shouldn't be too tricky to add Bis-SNP or something similar as a new opt-in process. There may be other / better tools also?
Felix Krueger mentioned four different packages for that purpose: Bis-SNP, MethylExtract, BS-SNPer and CGmapTools. BScall can also do this.
On a slightly different topic, from the Wreczycka et al. 2017 paper:
"the majority of CpGs with high inter-population differences contain common genomic SNPs (minor allele frequency > 0.01) (Daca-Roszak et al., 2015). To ensure more reliable interpretation of the data we advise removing known C/T SNPs which can interfere with methylation calls."
It would also be nice to have a dictionary of these sites for human and the option of removing them, if desired (--remove.common_snps).
Variant calls could also be derived from matched genome sequencing data or public databases such as dbSNP (https://www.ncbi.nlm.nih.gov/projects/SNP/dbSNP.cgi?list=sslist).
Ooh, @FelixKrueger? I wouldn't trust that guy.. 😆 Yes all sounds good - does anyone have a favourite tool?
The common SNPs feature would be nice, but I guess that's a separate issue as it doesn't require SNP calling, it's just a filtering step right? Do such lists already exist somewhere? Perhaps we can generate such a list from a VCF file in the pipeline. Then we could use the files available for multiple species already in iGenomes.
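For what it's worth, here is a rough sketch of what such a filtering step could look like. It is only an illustration, not something the pipeline does today. The function names `ct_snp_positions` and `drop_snp_sites` are made up for this example. It assumes an uncompressed VCF with the standard tab-separated columns, and a Bismark-style coverage file whose first two columns are chromosome and 1-based position; both files would also need to use the same chromosome naming. It flags C>T and, for cytosines on the reverse strand, G>A substitutions.

```python
from typing import Iterable, Iterator, Set, Tuple

# Substitutions that mimic bisulfite conversion: C>T on the forward strand,
# G>A when the affected cytosine sits on the reverse strand.
BISULFITE_LIKE = {("C", "T"), ("G", "A")}


def ct_snp_positions(vcf_lines: Iterable[str]) -> Set[Tuple[str, int]]:
    """Collect (chrom, pos) for SNPs that could be mistaken for conversion."""
    positions = set()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        ref, alts = fields[3].upper(), fields[4].upper()
        # ALT may list several alleles separated by commas.
        if any((ref, alt) in BISULFITE_LIKE for alt in alts.split(",")):
            positions.add((chrom, pos))
    return positions


def drop_snp_sites(cov_lines: Iterable[str],
                   snps: Set[Tuple[str, int]]) -> Iterator[str]:
    """Yield coverage lines whose (chrom, start) is not a flagged SNP."""
    for line in cov_lines:
        if not line.strip():
            continue
        fields = line.split("\t")
        # Assumption: column 2 is a 1-based position, matching VCF POS.
        if (fields[0], int(fields[1])) not in snps:
            yield line
```

With real files this would be something like `drop_snp_sites(open(cov_path), ct_snp_positions(open(vcf_path)))`, with both paths being placeholders. Holding the positions in a set keeps the lookup cheap, though for a full dbSNP-sized VCF the memory use would need checking.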
I think that matching to WGS and external databases is perhaps beyond the scope of this pipeline for now. If the pipeline produces a VCF it shouldn't be too difficult for people to play with this anyway. We could perhaps even make a separate nf-core pipeline for doing pairwise comparison / QC of VCF files...
I agree, it might be a nice pipeline to have. The tools mentioned above were - of course (in good old bioinformatics manner) - shown to be much superior to previously published tools. We don't personally use SNP exclusion on a regular basis, so I am not sure which one is best/easiest to implement.
On a slightly different note, would anyone object if we dropped Bowtie (1) from Bismark, and added HISAT2 instead?
Sure - go for it! Alignment speed can be one of the main annoyances with Bismark so a faster tool with comparable output would be great 👍 (though does this mean that I have to update the --relaxMismatches code? 😱 )
Hi, was this ever implemented or is there a fork that some work was done on?