
Mismatched UMI of a pair of reads

Open litun-fkby opened this issue 3 years ago • 1 comments

Hello, I ran into an error with gencore (version 0.17.2). The log output is:

1 contigs in the bam file:
chr1: 7249 bp

Mismatched UMI of a pair of reads
Left:
0:0, M:0:0 TLEN:0 ID:0
E100030802L1C001R00101395324:umi_CA_NN
TTTTTTTTTTTTTTTTTTTTTATTTTTTTTTTATTATTAATATTATTATTAATTTTAAAATCACAACAAAATACAAAAAACAAAAAAAAAAA
GGGGGHHGGHIGGGGGHGGGIIGGGGGGGGGGIGGGGHGIHGGGGGGHGGGGGGGGIGGGGGGHGGGHIGIGGGGGIIGGHHGGGGHIGGHG
Right:
0:0, M:0:0 TLEN:0 ID:0
E100030802L1C004R03200085905:umi_CG_NN
TTAAAAAATTATAAAAAAAAAATAAAAAAATAAAAAAAATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
ERROR: The UMI of a read pair should be identical, but we got CA_ and CG_

However, when I look up these reads in the original BAM, each read pair appears correct: both mates carry the same UMI.

It seems that gencore is treating two reads with different read names as a single read pair.
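One way to check this hypothesis outside gencore is to compare the read names directly. The following is only a minimal sketch, assuming the UMI is appended to the read name as `:umi_<UMI>` (as in the names in the error message); the helper names `split_umi` and `mismatched_templates` are hypothetical and are not part of gencore.

```python
from collections import defaultdict


def split_umi(read_name, prefix="umi"):
    """Split 'NAME:umi_CA_NN' into ('NAME', 'CA_NN').

    Assumes the UMI follows the last ':<prefix>_' in the read name.
    Returns (read_name, None) when no UMI marker is present.
    """
    marker = ":" + prefix + "_"
    base, sep, umi = read_name.rpartition(marker)
    if not sep:
        return read_name, None
    return base, umi


def mismatched_templates(read_names, prefix="umi"):
    """Return {template_name: set_of_umis} for templates with more than one UMI."""
    umis = defaultdict(set)
    for name in read_names:
        base, umi = split_umi(name, prefix)
        umis[base].add(umi)
    return {base: found for base, found in umis.items() if len(found) > 1}


# The two read names reported as "Left" and "Right" in the error message.
left = "E100030802L1C001R00101395324:umi_CA_NN"
right = "E100030802L1C004R03200085905:umi_CG_NN"

left_base, left_umi = split_umi(left)
right_base, right_umi = split_umi(right)

# Different template names: these two reads are not mates of each other,
# so comparing their UMIs as if they were a pair is not meaningful.
print(left_base == right_base)
print(mismatched_templates([left, right]))
```

With the two names from the error message, the template names differ, which is consistent with the two reads coming from different pairs rather than from one pair with inconsistent UMIs.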

The command is:

gencore --umi_prefix=umi -s 3 --ref hg19.fasta --quit_after_contig 25 -i input.bam -o output.umi.bam

However, when I add the "-d" parameter, the error does not occur, so the two runs may behave differently:

gencore --umi_prefix=umi -d 0 -s 3 --ref hg19.fasta --quit_after_contig 25 -i input.bam -o output.umi.bam

Looking forward to your reply!

litun-fkby, Mar 07 '22 08:03

I have run into the same problem and am looking forward to other responses.

In the meantime, I tried the following workaround: before running gencore, run this command to keep only properly paired reads and drop the rest, after which gencore works fine:

samtools view -@ $threads -bh -f 2 in.bam > out.bam
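For reference, `-f 2` keeps only records whose SAM FLAG has the 0x2 bit set (read mapped in a proper pair). The sketch below illustrates just that bit test in plain Python; the function name `is_proper_pair` and the example FLAG values are illustrative and not taken from the BAM in this issue.

```python
PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned according to the aligner


def is_proper_pair(flag):
    """Return True if a record with this FLAG would pass `samtools view -f 2`."""
    return (flag & PROPER_PAIR) == PROPER_PAIR


# 99 = 0x1 (paired) + 0x2 (proper pair) + 0x20 (mate reverse) + 0x40 (first in pair)
# 73 = 0x1 (paired) + 0x8 (mate unmapped) + 0x40 (first in pair)
example_flags = [99, 73]
kept = [flag for flag in example_flags if is_proper_pair(flag)]
print(kept)
```

A read whose mate is unmapped (FLAG 73 here) is dropped by the filter, which is the kind of record that would otherwise reach gencore without its partner.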

baoyl818, Sep 20 '22 03:09