Quality scores lost
I recently ran TranscriptClean on my PacBio reads, and it ran fine after I made the changes you suggested (thanks!). However, I noticed that the quality scores present in the input SAM file are not preserved in the output SAM. Would it be possible to add this feature at some point?
Of course, the quality string would need to be edited to match whichever indels are corrected. For a corrected insertion, the corresponding qual values would be removed, and for a corrected deletion you'd need to impute something for the added bases (maybe "!", the lowest Phred+33 score?).
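To make the idea concrete, here is a rough sketch of the quality-string edit in Python. It is not based on TranscriptClean's internals: the function name `adjust_quality` and the `(pos, kind, length)` edit tuples are hypothetical, and I'm assuming positions are 0-based offsets into the original read sequence. It also passes through the SAM convention of `*` for a missing QUAL field.

```python
def adjust_quality(qual, edits, placeholder="!"):
    """Return a quality string that matches a read after indel correction.

    qual        -- original QUAL string ("*" means no qualities stored)
    edits       -- iterable of (pos, kind, length) tuples (hypothetical format):
                   kind "I": `length` inserted bases starting at `pos` were
                             removed from the read, so drop their qualities
                   kind "D": `length` bases were added before `pos` to repair
                             a deletion, so impute placeholder qualities
    placeholder -- character used for imputed qualities
    """
    if qual == "*":
        # Nothing to adjust when the read carries no quality scores.
        return qual

    pieces = []
    cursor = 0  # next unread offset in the original quality string
    for pos, kind, length in sorted(edits):
        pieces.append(qual[cursor:pos])  # copy untouched qualities
        if kind == "I":
            cursor = pos + length        # skip the removed bases
        elif kind == "D":
            pieces.append(placeholder * length)
            cursor = pos
        else:
            raise ValueError(f"unknown edit kind: {kind!r}")
    pieces.append(qual[cursor:])
    return "".join(pieces)


# One removed insertion (1 base at offset 2) and one repaired
# deletion (2 bases added before offset 5).
fixed = adjust_quality("ABCDEFGH", [(2, "I", 1), (5, "D", 2)])
```

The result should always have the same length as the corrected read sequence, which would be an easy sanity check to add wherever the corrected SAM record gets written.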
If you're OK with a pull request I'd be happy to take a stab at it :)
When I initially developed the code, the PacBio reads I was using lacked quality scores altogether, so it wasn't something I thought a lot about. This seems like a reasonable feature to add, so you're welcome to give it a try! I'm working on a TranscriptClean upgrade right now (Python 3, among other things), which I hope to have out by next week, so it would probably make sense to open the pull request after that.
Sounds great, I'll keep an eye out for it!