CRISPRessoWGS : Reads with 3bp deletions not aligning

Open Salome-Brunon opened this issue 1 year ago • 8 comments

Hello,

I am running CRISPRessoWGS on Illumina data for a list of gRNA target genes (all 20bp). I have noticed something odd for one of my gRNAs: only one read mapped to my sequence. I had a look in IGV, and the start of the sequence contains a 3bp deletion, so none of the reads are considered as mapping to that gRNA. I have other cases where a 2bp or 1bp deletion at the start/end does not cause this issue. I am not sure how to correct for that using the available parameters. What do you think?

This is the CRISPResso output bam of the region in question: Image

This is the command I ran:

CRISPRessoWGS \
    -b "$bam_path" \
    -f "$guide_file" \
    -r "$ref_file" \
    --exclude_bp_from_right 0 \
    --exclude_bp_from_left 0 \
    --bam_output \
    --name "$out_name"

My ref file looks like this:

2	22960772	22960792	Abi1_1
2	22971105	22971125	Abi1_2
2	22953547	22953567	Abi1_3
2	22962404	22962424	Abi1_4
2	26481561	26481581	Notch1_1
2	26469921	26469941	Notch1_2
2	26468403	26468423	Notch1_3
2	26476383	26476403	Notch1_4

Thank you!

Salome-Brunon avatar Apr 24 '25 14:04 Salome-Brunon

Hi @Salome-Brunon, I'm not sure exactly what's going on here. If you slide the region in your ref file one base to the left (2 26469920 26469940 Notch1_2), does it work?

kclem avatar Apr 28 '25 17:04 kclem

Thank you for the suggestion, it does work if I slide it to the left. How should I deal with this for the rest of my gRNAs then? Should I add 1bp to both ends so they are 22bp long? This is not the only gRNA where I have this issue.

Salome-Brunon avatar Apr 29 '25 08:04 Salome-Brunon

Hello, I have been comparing my results when I run CRISPRessoWGS with a 22bp gRNA sequence vs a 20bp one, and I have observed another strange result. There seems to be a shift in where my deletion is found.

These are the modifications found for my 20bp gRNA: Image

vs the modifications found for my 22bp gRNA (1bp added to both ends): Image

but neither result seems to be correct, as this is what I observe in IGV: Image

Salome-Brunon avatar Apr 30 '25 11:04 Salome-Brunon

Hm. This is indeed strange.

CRISPRessoWGS produces output fastq files when it extracts sequences from your bams. These files are in the 'ANALYZED_REGIONS' folder. Do the extracted fastq sequences look correct?

kclem avatar May 06 '25 18:05 kclem

This is the fastq in ANALYZED_REGIONS:

@A00708:783:HHV7GDSXC:3:2615:4029:24064_1
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:4:1665:11496:29825_2
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:1175:29441:23735_3
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:4:1441:24722:3724_4
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:1415:1452:36855_5
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:2512:15302:8766_6
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:1143:30825:3740_7
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:1576:14226:15139_8
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:2477:11749:2237_9
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:2515:6614:15013_10
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:2:1664:31548:14888_11
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:2:2470:19904:29230_12
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:1553:17815:11647_13
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:4:2225:25536:19006_14
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:2175:14045:2018_15
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:1453:4625:12132_16
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:1237:20030:34804_17
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:2375:9489:24784_18
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:1446:21251:13088_19
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:4:2278:29713:23829_20
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:4:2106:14760:36667_21
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:2114:8657:21089_22
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:2175:14045:2018_23
CAGGGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:2:1576:10655:34131_24
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:4:2130:8449:1752_25
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:1:1143:30825:3740_26
CAGTGTCAACTGCAGTCAGT
@A00708:783:HHV7GDSXC:3:1576:14226:15139_27
CAGTGTCAACTGCAGTCAGT

This is the bam:

A00708:783:HHV7GDSXC:4:2251:7274:8202 CCTTAACCCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCC A00708:783:HHV7GDSXC:4:1343:12391:30561 ACCCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGT A00708:783:HHV7GDSXC:2:1470:5113:4523 CCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCA A00708:783:HHV7GDSXC:2:2469:1732:28040 CCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCA A00708:783:HHV7GDSXC:3:2260:17635:2879 TCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACT A00708:783:HHV7GDSXC:3:2615:4029:24064 GCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCC A00708:783:HHV7GDSXC:4:1665:11496:29825 GCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCC A00708:783:HHV7GDSXC:3:1175:29441:23735 ACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGG A00708:783:HHV7GDSXC:4:1441:24722:3724 ACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGG A00708:783:HHV7GDSXC:1:1415:1452:36855 TTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGT A00708:783:HHV7GDSXC:1:2512:15302:8766 
TTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGT A00708:783:HHV7GDSXC:1:1143:30825:3740 CCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGA A00708:783:HHV7GDSXC:3:1576:14226:15139 CCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGA A00708:783:HHV7GDSXC:1:2477:11749:2237 CCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCG A00708:783:HHV7GDSXC:1:2515:6614:15013 CTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTAC A00708:783:HHV7GDSXC:2:1664:31548:14888 TGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAG A00708:783:HHV7GDSXC:2:2470:19904:29230 GCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGG A00708:783:HHV7GDSXC:3:1553:17815:11647 GCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGG A00708:783:HHV7GDSXC:4:2225:25536:19006 CAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGA A00708:783:HHV7GDSXC:3:2175:14045:2018 CTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGG A00708:783:HHV7GDSXC:1:1453:4625:12132 
TCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGT A00708:783:HHV7GDSXC:3:1237:20030:34804 TGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACC A00708:783:HHV7GDSXC:1:2375:9489:24784 GGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCC A00708:783:HHV7GDSXC:1:1446:21251:13088 GCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCA A00708:783:HHV7GDSXC:4:2278:29713:23829 CACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAA A00708:783:HHV7GDSXC:4:2106:14760:36667 ACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAAC A00708:783:HHV7GDSXC:1:2114:8657:21089 GGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTAC A00708:783:HHV7GDSXC:3:2175:14045:2018 GGGCCAGGGCCCACCCAGGGTCAACTGCAGTCAGTTCATCCGGGGCCAGGCGTGTGTGGCGGAGTGCCGCGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGCGGGGTTGGGGTGGGGCAGAGTACCCCGAGTACCCCTCAAG A00708:783:HHV7GDSXC:2:1576:10655:34131 CCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACA A00708:783:HHV7GDSXC:4:2130:8449:1752 CCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACA A00708:783:HHV7GDSXC:1:1143:30825:3740 
CAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACAC A00708:783:HHV7GDSXC:3:1576:14226:15139 CAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACAC A00708:783:HHV7GDSXC:4:2161:7898:5556 TTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACACCCACCTCTCTCACAGGCTCCCCAGGGAGTA

The output sequence in the fastqs looks correct: it is the reverse complement of my gRNA CTGACTGCAGTTGACACACT (found on the - strand).
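For anyone who wants to repeat this check, here is a minimal sketch in plain Python (not part of CRISPResso; the function name `reverse_complement` is just illustrative). It reverse-complements the guide quoted above and compares it with one of the extracted read sequences from the fastq above.

```python
# Minimal sketch: reverse-complement the guide and compare it with an
# extracted read. Sequences are copied from this thread; the helper
# name is hypothetical and not part of CRISPResso.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


GUIDE = "CTGACTGCAGTTGACACACT"  # gRNA on the - strand
READ = "CAGTGTCAACTGCAGTCAGT"   # one read extracted from the 22bp region

guide_rc = reverse_complement(GUIDE)
print(guide_rc)                 # guide as it appears on the + strand
print(len(guide_rc), len(READ))  # guide length vs extracted read length
print(guide_rc in READ)         # is the unedited guide present in the read?
```

The last line is only an exact-substring test, so it says whether the unmodified guide sequence is present in the read, not where any edit is.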

Salome-Brunon avatar May 07 '25 14:05 Salome-Brunon

These are from the 22bp region, right? Do you mind posting the one from the 20bp region?

kclem avatar May 07 '25 17:05 kclem

Yes, those were 22bp regions. Here are the 20bp:

@A00708:783:HHV7GDSXC:3:2615:4029:24064_1
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:4:1665:11496:29825_2
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:1175:29441:23735_3
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:4:1441:24722:3724_4
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:1415:1452:36855_5
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:2512:15302:8766_6
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:1143:30825:3740_7
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:1576:14226:15139_8
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:2477:11749:2237_9
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:2515:6614:15013_10
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:2:1664:31548:14888_11
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:2:2470:19904:29230_12
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:1553:17815:11647_13
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:4:2225:25536:19006_14
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:2175:14045:2018_15
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:1453:4625:12132_16
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:1237:20030:34804_17
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:2375:9489:24784_18
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:1446:21251:13088_19
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:4:2278:29713:23829_20
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:4:2106:14760:36667_21
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:2114:8657:21089_22
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:2175:14045:2018_23
AGGGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:2:1576:10655:34131_24
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:4:2130:8449:1752_25
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:1:1143:30825:3740_26
AGTGTCAACTGCAGTCAG
@A00708:783:HHV7GDSXC:3:1576:14226:15139_27
AGTGTCAACTGCAGTCAG

A00708:783:HHV7GDSXC:4:1343:12391:30561 ACCCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGT A00708:783:HHV7GDSXC:2:1470:5113:4523 CCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCA A00708:783:HHV7GDSXC:2:2469:1732:28040 CCTTCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCA A00708:783:HHV7GDSXC:3:2260:17635:2879 TCAGCCTGGAGCCATGCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACT A00708:783:HHV7GDSXC:3:2615:4029:24064 GCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCC A00708:783:HHV7GDSXC:4:1665:11496:29825 GCTTACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCC A00708:783:HHV7GDSXC:3:1175:29441:23735 ACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGG A00708:783:HHV7GDSXC:4:1441:24722:3724 ACAGCTTTTTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGG A00708:783:HHV7GDSXC:1:1415:1452:36855 TTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGT A00708:783:HHV7GDSXC:1:2512:15302:8766 TTCCCTTAGACCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGT A00708:783:HHV7GDSXC:1:1143:30825:3740 
CCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGA A00708:783:HHV7GDSXC:3:1576:14226:15139 CCTGAGCCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGA A00708:783:HHV7GDSXC:1:2477:11749:2237 CCTAGCCAAACACTTCCTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCG A00708:783:HHV7GDSXC:1:2515:6614:15013 CTTCTCCTTGCTGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTAC A00708:783:HHV7GDSXC:2:1664:31548:14888 TGCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAG A00708:783:HHV7GDSXC:2:2470:19904:29230 GCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGG A00708:783:HHV7GDSXC:3:1553:17815:11647 GCAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGG A00708:783:HHV7GDSXC:4:2225:25536:19006 CAGGTCTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGA A00708:783:HHV7GDSXC:3:2175:14045:2018 CTTGAGGGCTTGGTCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGG A00708:783:HHV7GDSXC:1:1453:4625:12132 TCTGTAACTCACTGTGTGCCCGTGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGT A00708:783:HHV7GDSXC:3:1237:20030:34804 
TGGGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACC A00708:783:HHV7GDSXC:1:2375:9489:24784 GGCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCC A00708:783:HHV7GDSXC:1:1446:21251:13088 GCACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCA A00708:783:HHV7GDSXC:4:2278:29713:23829 CACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAA A00708:783:HHV7GDSXC:4:2106:14760:36667 ACTGCTGGGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAAC A00708:783:HHV7GDSXC:1:2114:8657:21089 GGGGCCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTAC A00708:783:HHV7GDSXC:3:2175:14045:2018 GGGCCAGGGCCCACCCAGGGTCAACTGCAGTCAGTTCATCCGGGGCCAGGCGTGTGTGGCGGAGTGCCGCGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGCGGGGTTGGGGTGGGGCAGAGTACCCCGAGTACCCCTCAAG A00708:783:HHV7GDSXC:2:1576:10655:34131 CCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACA A00708:783:HHV7GDSXC:4:2130:8449:1752 CCAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACA A00708:783:HHV7GDSXC:1:1143:30825:3740 CAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACAC A00708:783:HHV7GDSXC:3:1576:14226:15139 
CAGGGCCCACCCAGTGTCAACTGCAGTCAGTTCCTCCGGGGCCAGGAGTGTGTGGAGGAGTGCCGAGTATGGAAGGGGTACGAATTGCAGAGGAGTGGGTGGCTTGAGGGGTACTCGGGGTACTCTGCCCCACCCCAACCCCTTACTACAC

Salome-Brunon avatar May 08 '25 09:05 Salome-Brunon

Ok - now I see what is going on.

The short reads from the 20bp region can be aligned as:

read:      A--GTGTCAACTGCAGTCAG
reference: AGTGTGTCAACTGCAGTCAG

or (where the lowercase a represents a mismatch):

read:      --aGTGTCAACTGCAGTCAG
reference: AGTGTGTCAACTGCAGTCAG

For the first option, CRISPResso's aligner scores the first three positions as a match (+5), then a penalty for opening the gap (-20) and one gap extension (-2), for a total of -17.

For the second option, because the gap extends beyond the read there is no gap open penalty, so CRISPResso's aligner only scores those positions as two gap extensions (2 × -2) plus a mismatch (-4), for a total of -8. The remaining 17 positions match in both alignments, so the second alignment scores higher and is preferred.
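To make the arithmetic concrete, here is a small Python sketch that scores the two alignments with the values quoted above (match +5, mismatch -4, gap open -20, gap extend -2). It is a simplified illustration, not CRISPResso's actual aligner. It assumes that gaps appear only in the read, that gaps at the very start of the read are charged as extensions only, and that an internal gap is charged one open plus one extension per additional gapped base.

```python
# Simplified scoring sketch; NOT CRISPResso's implementation.
# Score values are the ones quoted in this comment.
MATCH, MISMATCH = 5, -4
GAP_OPEN, GAP_EXTEND = -20, -2


def score_alignment(read: str, ref: str) -> int:
    """Score a gapped read against the reference, column by column.

    Assumption: only the read contains '-' characters, and gaps at the
    start of the read carry no gap-open penalty.
    """
    if len(read) != len(ref):
        raise ValueError("aligned strings must have equal length")
    leading = len(read) - len(read.lstrip("-"))  # gap columns before the read starts
    total = 0
    prev_gap = False
    for i, (r, f) in enumerate(zip(read, ref)):
        if r == "-":
            # Leading gaps and continued gaps cost an extension;
            # the first column of an internal gap costs the open penalty.
            total += GAP_EXTEND if (i < leading or prev_gap) else GAP_OPEN
            prev_gap = True
        else:
            total += MATCH if r == f else MISMATCH
            prev_gap = False
    return total


REF = "AGTGTGTCAACTGCAGTCAG"
internal_gap = score_alignment("A--GTGTCAACTGCAGTCAG", REF)
leading_gap = score_alignment("--AGTGTCAACTGCAGTCAG", REF)
print(internal_gap, leading_gap)  # 68 77
```

Under these assumptions the leading-gap alignment wins by 9 points, which is why the deletion ends up reported in the wrong place.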

To change this, I suggest increasing the window provided in your region file around all of your guides. If your reads are longer than 50bp, you should be able to add 5bp on each side of the guide (a total of ~30bp) and be fine. Otherwise, you could also try changing the --needleman_wunsch_gap_open or --needleman_wunsch_gap_extend parameters (e.g. set [--needleman_wunsch_gap_extend](https://docs.crispresso.com/suite/core/parameters.html#needleman_wunsch_gap_extend) -20); you'll probably have to play around a bit to get the results you want.
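One way to widen every region at once is sketched below in Python. It assumes the region file has the four tab-separated columns shown earlier in this thread (chromosome, start, end, name); the function name `pad_regions` and the default of 5bp are illustrative choices, not CRISPResso features.

```python
# Sketch: widen each region by a fixed number of bases on both sides.
# Assumes tab-separated lines of: chromosome, start, end, name.
from typing import Iterable, List


def pad_regions(lines: Iterable[str], pad: int = 5) -> List[str]:
    """Return region lines with start decreased and end increased by `pad`."""
    padded = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, name = line.split("\t")[:4]
        new_start = max(0, int(start) - pad)  # never go below position 0
        new_end = int(end) + pad
        padded.append(f"{chrom}\t{new_start}\t{new_end}\t{name}")
    return padded


example = ["2\t26469921\t26469941\tNotch1_2"]
print(pad_regions(example))
```

For the Notch1_2 line this gives the region 26469916 to 26469946 (30bp). To apply it to a whole file, read the file's lines, pass them to `pad_regions`, and write the result to a new region file.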

I hope that helps!

kclem avatar May 08 '25 19:05 kclem