slimm
Question: How does slimm deal with discordant mappings of paired-end reads?
Dear @temehi & @agakrawczyk,
@agakrawczyk and I ran into problems when using bowtie2 with slimm:
- Yara, bowtie2 and slimm indices were built from human chr1 & chr11 and C-RVDB (https://rvdb.dbi.udel.edu/).
- An in silico data set comprising 91% Human chr1 & chr11 reads and 9% Human virus reads of various species was generated and mapped with bowtie2 or yara.
- Slimm was used for abundance estimation for bowtie2 or yara mappings.
While "yara+slimm" gave consistent results for all assayed viruses, "bowtie2+slimm" failed for two viruses. Closer inspection of the mapping files showed that bowtie2 reported many discordant mappings across various reference sequences:
MK630134.1_50_0 97 KY315545.1 7713 1 301M KY315552.1 8791 0 CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/* AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:281C19 YT:Z:UP
MK630134.1_50_0 353 KY290183.1 7394 1 301M KY315552.1 8791 0 CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/* AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:281C19 YT:Z:UP
MK630134.1_50_0 353 KY316048.1 11168 1 301M KY315552.1 8791 0 CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/* AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:281C19 YT:Z:UP
MK630134.1_50_0 353 KY274508.1 7501 1 301M KY315552.1 8791 0 CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/* AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:281C19 YT:Z:UP
MK630134.1_50_0 353 MH698400.1 15286 1 301M KY315552.1 8791 0 CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/* AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:281C19 YT:Z:UP
MK630134.1_50_0 353 KY315552.1 7711 1 301M = 8791 0 CCAGGTCCAGTCAGATAATAAATATCCGATAAGGAACAAAAGGAAAAGTCAAAGTCCTGGAAAGCATCCACCCTGATTCTCTTGCCGGAATGTTTTGCCACGTAATCCTTCATAGCAGGGAGATTCCTCTGTAAAAGGATAAATTCCTCCCGGCCGCATTTAGTCTCTCCGAACCACATCTTTTCCTGCTCTGGATCTAACTCCGCTTCGGCAAAAGGCGGAAATTGTCTAAGTCCAATATTGAAGAACTGGACCAATGATTCTGCTATGATGTACACTTTTTCCGTCACCGTGTCCACGG CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGDGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGDGGFGGGGGGGGG9FGGGFGGGGGFGGGGG<GCGGFGGFAGCGGG@GGGGEFGEFDGGGGD9FGGGGGGAGGEFFFCGFGFDAEEFF@FF@8GGECFCDFG7;@@EGFGG*AFCFG,FGE:EF@F,:@G*G?DFDF7*FGCGGCC9EFE+FDG)FC78FFG6<2GG73;+<AA2F881)1)5.)8.0:).+AF09:F9/* AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:281C19 YT:Z:UP
Yara did not report these discordant mappings.
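To make clear what I mean by "discordant" here, below is a minimal Python sketch of how such records can be recognised from the SAM fields alone. It is not slimm's code and says nothing about what slimm actually does; the function names `classify` and `count_mate_on_other_reference` are made up for this illustration, and it assumes ordinary tab-delimited SAM lines. The flag bits are the ones defined in the SAM specification.

```python
# SAM flag bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100


def classify(sam_line):
    """Summarise the pairing status of one tab-delimited SAM record.

    Hypothetical helper for illustration only, not part of slimm.
    """
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname, rnext = fields[0], int(fields[1]), fields[2], fields[6]
    paired = bool(flag & PAIRED)
    both_mapped = not (flag & (UNMAPPED | MATE_UNMAPPED))
    # RNEXT "=" means the mate lies on the same reference as this record;
    # "*" means the mate reference is unavailable.
    mate_elsewhere = rnext not in ("=", "*") and rnext != rname
    return {
        "qname": qname,
        "rname": rname,
        "secondary": bool(flag & SECONDARY),
        "proper_pair": bool(flag & PROPER_PAIR),
        "mate_on_other_reference": paired and both_mapped and mate_elsewhere,
    }


def count_mate_on_other_reference(lines):
    """Count records whose mate maps to a different reference sequence."""
    total = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        if classify(line)["mate_on_other_reference"]:
            total += 1
    return total


# Two records modelled on the ones above, with SEQ and QUAL replaced by "*"
# and the optional tags dropped to keep the example short.
example = [
    "\t".join(["MK630134.1_50_0", "97", "KY315545.1", "7713", "1",
               "301M", "KY315552.1", "8791", "0", "*", "*"]),
    "\t".join(["MK630134.1_50_0", "353", "KY315552.1", "7711", "1",
               "301M", "=", "8791", "0", "*", "*"]),
]
print(count_mate_on_other_reference(example))
```

In the records above, flag 97 is a primary alignment of the first mate and 353 is a secondary one; in both the proper-pair bit (0x2) is unset, and in all but the last record the mate sits on a different reference (KY315552.1).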
My question: How does slimm deal with discordant mappings? My suspicion is that they are discarded.
Thanks in advance!