Inversions based on split reads

Open wbooker opened this issue 4 years ago • 0 comments

Lately I've been trying to detect SV signals in my reads, using samplot to verify that my methods are working. I noticed that some split reads that look like inversions do not appear on the plot. Below is an example of a split read that is skipped (the primary alignment first, then its supplementary alignment):

A00434:100:HM75NDMXX:2:1265:1497:27445 147 AaegL5_1 97039982 60 105M45S = 97039883 -204 AAACAAACGCCGATAAGACCCTGATCGACTCGGAACTACAATCTGTTGCGCTTTCTTCACAAACAATGGACCCACAACAGTTGGTGAGGCGCACTGGGAGGGAGCAAGTGCAACACGCTAAGAACTGGAGTCCTCCTAGCTAGTAGGAGG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFF:FFFFFF SA:Z:AaegL5_1,97041467,+,47M103S,60,0; MC:Z:150M MD:Z:105 RG:Z:HM75NDMXX.2.1101 NM:i:0 AS:i:105 XS:i:0

A00434:100:HM75NDMXX:2:1265:1497:27445 2179 AaegL5_1 97041467 60 47M103H = 97039883 -1585 CCTCCTACTAGCTAGGAGGACTCCAGTTCTTAGCGTGTTGCACTTGC FFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF SA:Z:AaegL5_1,97039982,-,105M45S,60,0; MC:Z:150M MD:Z:47 RG:Z:HM75NDMXX.2.1101 NM:i:0 AS:i:47 XS:i:0

After looking through the samplot code and testing a change, it looks like this split is skipped because the query position of both alignments is the same (0). This case is handled at line 1521, in the function get_long_read_plan:

if alignment.query_position in seen:
    continue
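
To make the collision concrete, here is a minimal sketch in plain Python (no pysam) that reads the flag and CIGAR of the two records above. It rests on an assumption that is not confirmed from the samplot source: that the query position is the number of clipped bases at the start of the CIGAR as written. The helper names (leading_clip, trailing_clip, read_oriented_start) are hypothetical and are not part of samplot. For comparison, the sketch also computes a strand-aware offset, counting the clip from the other end of the CIGAR when the 0x10 (reverse strand) flag bit is set.

```python
import re

# Matches one CIGAR operation, e.g. "105M" -> ("105", "M").
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def leading_clip(cigar):
    """Clipped bases (S or H) before the first non-clip operation."""
    total = 0
    for length, op in CIGAR_RE.findall(cigar):
        if op not in "SH":
            break
        total += int(length)
    return total


def trailing_clip(cigar):
    """Clipped bases (S or H) after the last non-clip operation."""
    total = 0
    for length, op in reversed(CIGAR_RE.findall(cigar)):
        if op not in "SH":
            break
        total += int(length)
    return total


def read_oriented_start(cigar, flag):
    """Offset of the aligned segment in the read as sequenced.

    Hypothetical helper: for a reverse-strand alignment (flag bit 0x10)
    the CIGAR is written in reference orientation, so the start of the
    original read corresponds to the trailing clip.
    """
    if flag & 0x10:
        return trailing_clip(cigar)
    return leading_clip(cigar)


# (label, FLAG, CIGAR) taken from the two SAM records in this issue.
alignments = [
    ("primary", 147, "105M45S"),
    ("supplementary", 2179, "47M103H"),
]

# Assumed definition of query position: leading clip only.
naive = {name: leading_clip(cigar) for name, flag, cigar in alignments}
# Strand-aware alternative, for comparison.
oriented = {name: read_oriented_start(cigar, flag) for name, flag, cigar in alignments}

print(naive)     # {'primary': 0, 'supplementary': 0}
print(oriented)  # {'primary': 45, 'supplementary': 0}
```

Under the leading-clip assumption both segments get position 0, which would trigger the check above. With the strand-aware offset the two segments start at 45 and 0 in the original read, so they would no longer share a key. Whether that is the appropriate definition for samplot is not something this sketch can settle.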

I'll admit I am still trying to get my head around using CIGAR strings to get information about split reads and SVs, but I can't find anything that explains why a split whose alignments share the same query position should be omitted like this. If you can provide any information that could help me figure this out, that would be greatly appreciated!

-Will

wbooker, Jan 27 '22 19:01