feat: Remove PCR duplicates

Open ninsch3000 opened this issue 3 years ago • 0 comments

Is your feature request related to a problem? Please describe.
Remove PCR duplicates with a third-party tool.

Describe the solution you'd like
Suggestions from the old discussion:

  • https://umi-tools.readthedocs.io/en/latest/: @mkatsanto: UMIs are hard to identify because there is no common format; not feasible
  • https://sourceforge.net/projects/ngsreadstreatment/: @AngryMaciek tested this extensively; runtime was 40 h on a paired-end sample, so it is not really suitable for ZARP
  • samtools-markdup: http://www.htslib.org/doc/samtools-markdup.html (suggestion by @fgypas)
  • MarkDuplicates from Picard tools: http://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates and https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard- (suggestion by @fgypas). @mkatsanto: this tool would be suitable, but it works on BAM files, which would have to be converted back to FASTQ afterwards to allow the other ZARP steps to work
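For reference if the discussion reopens, here is a rough sketch of what the samtools route with a BAM-to-FASTQ round trip might look like. It only builds the command lines and does not run anything. The sequence collate, fixmate -m, sort, markdup follows the example workflow in the samtools-markdup documentation linked above; the `-r` flag (remove rather than mark duplicates), the second collate before `samtools fastq`, the function name `dedup_commands` and all file names are assumptions made for illustration, not part of ZARP. It also assumes a paired-end BAM and position-based duplicate detection without UMIs.

```python
import shlex
from pathlib import Path
from typing import List


def dedup_commands(in_bam: str, workdir: str, sample: str = "sample") -> List[List[str]]:
    """Build (but do not run) samtools commands that remove duplicates
    from a paired-end BAM and write the surviving reads back to FASTQ.

    Hypothetical helper: names and paths are illustrative only.
    """
    w = Path(workdir)
    collated = str(w / f"{sample}.collate.bam")
    fixmate = str(w / f"{sample}.fixmate.bam")
    sorted_bam = str(w / f"{sample}.sorted.bam")
    dedup = str(w / f"{sample}.dedup.bam")
    recollated = str(w / f"{sample}.dedup.collate.bam")
    r1 = str(w / f"{sample}.dedup.R1.fastq")
    r2 = str(w / f"{sample}.dedup.R2.fastq")

    return [
        # Group reads by name so that fixmate sees mates next to each other.
        ["samtools", "collate", "-o", collated, in_bam],
        # Add mate score tags, which markdup uses to pick the read to keep.
        ["samtools", "fixmate", "-m", collated, fixmate],
        # markdup expects coordinate-sorted input.
        ["samtools", "sort", "-o", sorted_bam, fixmate],
        # -r removes duplicates instead of only flagging them (assumed choice).
        ["samtools", "markdup", "-r", sorted_bam, dedup],
        # Regroup by name so that mates end up paired in the FASTQ output.
        ["samtools", "collate", "-o", recollated, dedup],
        # Write mates to R1/R2; discard unpaired and singleton reads.
        ["samtools", "fastq", "-1", r1, "-2", r2,
         "-0", "/dev/null", "-s", "/dev/null", "-n", recollated],
    ]


if __name__ == "__main__":
    for cmd in dedup_commands("aligned.bam", "work", "sample1"):
        print(shlex.join(cmd))
```

Note that this needs an alignment before deduplication and a second pass through the FASTQ-based steps afterwards, which is the overhead @mkatsanto pointed out for the Picard option as well.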

Additional context
This was discussed extensively in the old GitLab repository, with no clear conclusion but a trend towards omitting this feature. It is saved here mainly for documentation and for the links, in case the discussion reopens.

ninsch3000 • Mar 10 '23 16:03