TranscriptClean

Incompatible with CIGAR operators X/=

Open pdishuck opened this issue 6 years ago • 3 comments

This tool fails for SAM files using X/= CIGAR operators instead of M, which are coming into more common use. It seems to be a quick fix to look for X and = in any place the current code looks for M, but there may be some side effects I'm not aware of.

pdishuck avatar Apr 12 '19 00:04 pdishuck

Thank you, I will keep that in mind for a future version. The X may pose a slight challenge because there is an internal step that merges the MD tag and CIGAR string information, and that step uses X characters to represent mismatches from reference. But in the meantime, a good solution to allow TranscriptClean to run on these files would be to use regex or perhaps a simple script to convert the X/= in the CIGAR string to M.

dewyman avatar Apr 15 '19 21:04 dewyman
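
The workaround suggested in the comment above (rewriting X/= as M before running TranscriptClean) could look roughly like the following. This is a minimal sketch, not code from TranscriptClean: the function names `collapse_cigar` and `convert_sam_line` are hypothetical, and it assumes tab-delimited SAM text with the CIGAR string in the sixth column. It also merges adjacent runs that become M after conversion, so `5=1X4=` turns into `10M` rather than `5M1M4M`. Optional tags such as MD are left untouched.

```python
import re

# One CIGAR element: a length followed by a single operator character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def collapse_cigar(cigar):
    """Rewrite = and X as M, merging adjacent runs of the same operator."""
    if cigar == "*":  # CIGAR unavailable; nothing to convert
        return cigar
    merged = []
    for length, op in CIGAR_OP.findall(cigar):
        if op in "=X":
            op = "M"
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # extend the previous run
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)

def convert_sam_line(line):
    """Return a SAM line with its CIGAR column converted; headers pass through."""
    if line.startswith("@"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:  # not a complete alignment record; leave as-is
        return line
    fields[5] = collapse_cigar(fields[5])  # column 6 holds the CIGAR string
    return "\t".join(fields) + "\n"
```

To convert a whole file, one option is to apply `convert_sam_line` to each line of the input SAM and write the results to a new file, then run TranscriptClean on that copy.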

Just wanted to chime in that I hope you do add support for X/= cigar operators in the future! In the meantime, would it be possible to add a note to the readme or the wiki about this requirement?

laurenfitch avatar Jun 18 '19 20:06 laurenfitch

No problem. A note has been added to the README, and I will work on the compatibility issue.

dewyman avatar Jun 18 '19 20:06 dewyman