Length Limit stops working past a certain point
My pipeline uses fastp on paired-end reads for UMI processing, adapter trimming, and setting a length limit. Because of the chemistry involved, our maximum read length is 167 nucleotides. If I set the length limit to 141 nucleotides or lower, the program behaves as expected and returns only reads of 141 nucleotides or fewer. Once the limit is set to 142 nucleotides or higher, the FASTQ files are returned in their entirety with no length filtering at all. The magic number seems to be 25: if the specified limit is within 25 nucleotides of the maximum read length, fastp returns all reads.
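For anyone who wants to reproduce the check, below is a minimal sketch of how the read lengths in an output file can be tallied. It is independent of fastp, the function names are my own, and the file name in the usage note is hypothetical. It assumes standard four-line FASTQ records.

```python
import gzip
from collections import Counter


def read_lengths(path):
    """Tally read lengths in a FASTQ file (plain text or gzipped).

    Assumes standard four-line records: header, sequence, '+', qualities.
    """
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                counts[len(line.strip())] += 1
    return counts


def count_over_limit(path, limit):
    """Number of reads strictly longer than `limit` nucleotides."""
    return sum(n for length, n in read_lengths(path).items() if length > limit)
```

Calling `count_over_limit("out_R1.fastq.gz", 142)` on an output file produced with a limit of 142 should return 0 if the filter was applied; a non-zero count means reads above the limit passed through.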
-
At what stage of processing does fastp filter by read length? (I'm assuming it happens after UMI removal and adapter cleaving, but I'd like to confirm that. Please note: we are not trimming reads to a length, we are filtering for a maximum read length, and I could not find where the documentation specifies when that filter is applied.)
-
We use Illumina MiSeq adapters (for which we specify the sequence), which are 33 nucleotides long, and an 8-nucleotide UMI. 33 - 8 = 25, and 167 - 25 = 142, which is exactly the first limit at which filtering stops working. So I'm wondering whether the UMI removal and adapter cleaving have anything to do with the length limit problem, and whether anyone has any insight into this.
Thank you for any help you can provide!