Incompatible with CIGAR operators X/=
This tool fails on SAM files that use the X/= CIGAR operators instead of M, and these operators are becoming more common. A quick fix might be to check for X and = everywhere the current code checks for M, but there may be side effects I'm not aware of.
Thank you, I will keep that in mind for a future version. The X may pose a slight challenge because there is an internal step that merges the MD tag and CIGAR string information, and that step uses X characters to represent mismatches from the reference. In the meantime, a good way to let TranscriptClean run on these files is to use a regex or a simple script to convert the X/= operators in the CIGAR string to M.
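For anyone who needs the workaround now, here is a minimal sketch of such a conversion script. It is not part of TranscriptClean. The function names (`collapse_cigar`, `convert_sam_line`, `convert_sam`) are made up for illustration. It assumes uncompressed, tab-delimited SAM text with the CIGAR in column 6, and it leaves the MD tag and all other fields untouched. Adjacent runs are merged after the rewrite, so `10=1X5=` becomes `16M` rather than `10M1M5M`.

```python
import re

# One CIGAR operation: a length followed by an operator character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def collapse_cigar(cigar):
    """Rewrite = and X operations as M and merge adjacent runs of the same op."""
    if cigar == "*":  # CIGAR unavailable (e.g. unmapped read): leave as-is
        return cigar
    merged = []
    for length, op in CIGAR_OP.findall(cigar):
        if op in "=X":
            op = "M"
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # extend the previous run
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)


def convert_sam_line(line):
    """Return the SAM line with its CIGAR field (column 6) converted."""
    if line.startswith("@"):  # header lines pass through unchanged
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:  # not a full alignment record: do not touch it
        return line
    fields[5] = collapse_cigar(fields[5])
    return "\t".join(fields) + "\n"


def convert_sam(in_handle, out_handle):
    """Stream a SAM file from in_handle to out_handle, converting each record."""
    for line in in_handle:
        out_handle.write(convert_sam_line(line))
```

To use it as a filter, call `convert_sam(sys.stdin, sys.stdout)` after `import sys` and pipe the SAM file through the script. Whether the downstream MD/CIGAR merge step behaves correctly on the converted file should still be checked on a small sample first.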
Just wanted to chime in that I hope you do add support for X/= CIGAR operators in the future! In the meantime, would it be possible to add a note to the README or the wiki about this requirement?
No problem. A note has been added to the README. I will work on the compatibility issue.