
--length-max ignored or toned down?

Open FiniDG opened this issue 7 years ago • 1 comments

I have a question. I ran the following two commands, which differ only in --length-max (40000 vs. 70000):

pbsim --data-type CLR --depth 40 --length-min 500 --length-mean 12000 --length-max 40000 --accuracy-min 0.85 --accuracy-mean 0.87 --model_qc /PBSIM-PacBio-Simulator/data/model_qc_clr /path/to/ref.fasta --prefix /path/to/output/

pbsim --data-type CLR --depth 40 --length-min 500 --length-mean 12000 --length-max 70000 --accuracy-min 0.85 --accuracy-mean 0.87 --model_qc /PBSIM-PacBio-Simulator/data/model_qc_clr /path/to/ref.fasta --prefix /path/to/output/

In both cases, whatever value I pass as --length-max, my longest read is around 26000 bases.

What is the reason for this? The ref.fasta I used is only 14 Mb in size; could that be the cause? Or does PBSIM enforce a specific read length distribution?

FiniDG, Dec 04 '18 07:12

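After further investigation I think I found the answer, and I am leaving my thoughts here for future users. There is also a parameter called --length-sd, which defaults to 2300 for the CLR data type. I believe this is the reason: with a mean of 12000 and such a small standard deviation, reads much longer than about 26000 are so unlikely that --length-max never comes into play.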
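To sanity-check that explanation, here is a rough Python sketch. It assumes read lengths are drawn from a log-normal distribution defined by --length-mean and --length-sd, and that draws outside [--length-min, --length-max] are simply redrawn. Both points are my assumptions for illustration, not PBSIM's actual code, and `lognormal_params` and `simulate_lengths` are hypothetical helper names. The read count is approximated as genome size × depth / mean length.

```python
import math
import random


def lognormal_params(mean, sd):
    """Convert a desired mean and sd into the (mu, sigma) of a log-normal."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def simulate_lengths(n_reads, mean, sd, length_min, length_max, seed=0):
    """Draw read lengths, redrawing values outside [length_min, length_max].

    Rejection sampling is an assumption made for this sketch only.
    """
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    lengths = []
    while len(lengths) < n_reads:
        x = rng.lognormvariate(mu, sigma)
        if length_min <= x <= length_max:
            lengths.append(int(x))
    return lengths


# Values taken from the commands above: 14 Mb reference, depth 40, mean 12000.
genome_size = 14_000_000
depth = 40
n_reads = genome_size * depth // 12000

for length_max in (40000, 70000):
    lengths = simulate_lengths(n_reads, 12000, 2300, 500, length_max)
    print(f"--length-max {length_max}: longest simulated read = {max(lengths)}")
```

Under these assumptions the longest simulated read should come out the same for both --length-max values, because the cap is never reached. Passing a larger sd to `simulate_lengths` shows how widening the distribution would let the cap start to matter.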

FiniDG, Dec 11 '18 11:12