Generating reads with even 10X coverage for WGS is taking too long

Open tnnandi opened this issue 3 years ago • 1 comments

Hi,

I'm trying to generate reads at 10X coverage for a genome of 3 billion bp with a read length of 2x150 bp. This requires about 100 million read pairs (3 Gbp × 10 / 300 bp per pair), and generating them all on a single CPU is turning out to be almost impossible. Is there a way to accelerate the read generation process?

Thank you very much.

tnnandi avatar Dec 24 '22 04:12 tnnandi

Split your genome up into chromosomes (or contigs), run dwgsim on each separately, and then concatenate the FASTQ (and mutation) files. If you didn't already know, you can directly cat the gz files.
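
A minimal sketch of that workflow in Python, not a tested pipeline. It assumes `dwgsim` is on the `PATH`, that `-C` sets mean coverage and `-1`/`-2` set the read lengths (check `dwgsim -h` for your version), and that each run writes `<prefix>.bwa.read1.fastq.gz` and `<prefix>.bwa.read2.fastq.gz`. The function names `split_fasta`, `dwgsim_command`, `concat`, and `simulate` are made up for illustration.

```python
import shutil
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_fasta(fasta, outdir):
    """Write each record of a multi-FASTA file to its own file; return the paths."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    paths, out = [], None
    with open(fasta) as fh:
        for line in fh:
            if line.startswith(">"):
                if out:
                    out.close()
                name = line[1:].split()[0]  # sequence ID up to the first space
                path = outdir / f"{name}.fa"
                paths.append(path)
                out = open(path, "w")
            if out:
                out.write(line)
    if out:
        out.close()
    return paths


def dwgsim_command(ref, prefix, coverage=10, read_len=150):
    """Build the argument list for one dwgsim run (flags are assumptions, see above)."""
    return ["dwgsim", "-C", str(coverage), "-1", str(read_len),
            "-2", str(read_len), str(ref), str(prefix)]


def concat(parts, dest):
    """Byte-wise concatenation; this is valid for gzip files because
    back-to-back gzip streams decompress as one continuous stream."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def simulate(fasta, workdir, jobs=4):
    """Run dwgsim once per contig, `jobs` at a time, then merge the FASTQ files."""
    workdir = Path(workdir)
    refs = split_fasta(fasta, workdir / "refs")
    prefixes = [workdir / ref.stem for ref in refs]
    # Threads are enough here: each worker just waits on an external process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        list(pool.map(
            lambda pair: subprocess.run(dwgsim_command(*pair), check=True),
            zip(refs, prefixes),
        ))
    # Output file names below are assumed; adjust to what your dwgsim writes.
    for suffix in ("bwa.read1.fastq.gz", "bwa.read2.fastq.gz"):
        concat([f"{p}.{suffix}" for p in prefixes], workdir / f"all.{suffix}")
```

Calling `simulate("genome.fa", "sim_out", jobs=8)` would then run up to eight contigs at once. The mutation files are not merged here; plain-text ones can be concatenated the same way, while VCF output would need its repeated headers handled.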

davetang avatar Jun 06 '23 08:06 davetang