--length-max ignored or toned down?
I have a question about --length-max. I ran these two commands:
pbsim --data-type CLR --depth 40 --length-min 500 --length-mean 12000 --length-max 40000 --accuracy-min 0.85 --accuracy-mean 0.87 --model_qc /PBSIM-PacBio-Simulator/data/model_qc_clr /path/to/ref.fasta --prefix /path/to/output/
and
pbsim --data-type CLR --depth 40 --length-min 500 --length-mean 12000 --length-max 70000 --accuracy-min 0.85 --accuracy-mean 0.87 --model_qc /PBSIM-PacBio-Simulator/data/model_qc_clr /path/to/ref.fasta --prefix /path/to/output/
Whatever value I set for --length-max, my longest read is still around 26000 bp.
What is the reason for this? The ref.fasta I used is only 14 Mb in size; might that be the reason? Or does PBSIM try to produce a specific read length distribution?
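After further investigation I think I found the answer, and I am putting my thoughts here for future users. There is also a parameter called --length-sd, which defaults to 2300 for the CLR data type. I believe this is the reason: with a mean of 12000 and a standard deviation of 2300, reads anywhere near 40000 or 70000 are extremely unlikely, so --length-max only acts as an upper cutoff that is never reached.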
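To sanity-check this, here is a rough back-of-the-envelope sketch in Python. It assumes that read lengths follow a log-normal distribution parameterised by --length-mean and --length-sd (that is my understanding of PBSIM's model-based mode, not something I verified in the source code), and it ignores the --length-min and --length-max cutoffs. The function names are my own and are not part of PBSIM.

```python
import math
from statistics import NormalDist


def lognormal_params(mean, sd):
    """Convert the mean/sd of a log-normal variable to the mu/sigma of its log."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def typical_longest_read(genome_size, depth, mean, sd):
    """Estimate the length that about one read in the whole run exceeds.

    Assumption: lengths are log-normal with the given mean and sd,
    and no min/max cutoff is applied.
    """
    n_reads = genome_size * depth / mean          # approximate number of reads
    mu, sigma = lognormal_params(mean, sd)
    z = NormalDist().inv_cdf(1.0 - 1.0 / n_reads)  # quantile hit ~once per run
    return math.exp(mu + sigma * z)


if __name__ == "__main__":
    # Values from the commands above: 14 Mb reference, depth 40,
    # --length-mean 12000, default --length-sd 2300.
    longest = typical_longest_read(14_000_000, 40, 12_000, 2_300)
    print(f"typical longest read: {longest:.0f}")
```

Under these assumptions the estimate comes out in the mid-20000s, which is close to the roughly 26000 I observed, so the narrow default --length-sd looks like a plausible explanation. If longer reads are needed, raising --length-sd together with --length-max should help, although I have not tested that here.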