How can I find out which specific sequences are removed by the "remove_duplicate_seqs" Python script?

Open nguyenvantuananh99 opened this issue 1 year ago • 0 comments

Dear author,

Thank you very much for your valuable Python script for removing duplicate sequences from a genome FASTA file, either by ID or by the sequence itself: https://github.com/SilentGene/Bio-py/tree/master/remove_duplicate_seqs

I have one question about the script: how can I find out which specific sequences it removes? I would like the IDs or sequences that are filtered out to be written to a separate file, so that I can check that the filtering is accurate.
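To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not taken from remove_duplicate_seqs and may differ from how the script works internally; the function names (`read_fasta`, `split_duplicates`, `write_fasta`) and the `by` parameter are hypothetical, and I am assuming that the first occurrence of each ID or sequence is the one that is kept.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_duplicates(records, by="id"):
    """Return (kept, removed) lists of records.

    Assumption: the first record seen for each key is kept, and every
    later record with the same key is reported as removed.
    by="id"  compares the first word of the header line;
    by="seq" compares the sequence itself, ignoring case.
    """
    seen = set()
    kept, removed = [], []
    for header, seq in records:
        if by == "id":
            key = (header.split() or [""])[0]
        else:
            key = seq.upper()
        if key in seen:
            removed.append((header, seq))
        else:
            seen.add(key)
            kept.append((header, seq))
    return kept, removed


def write_fasta(records, handle):
    """Write (header, sequence) pairs to a text handle in FASTA format."""
    for header, seq in records:
        handle.write(f">{header}\n{seq}\n")
```

With something like this, `kept, removed = split_duplicates(read_fasta(open("input.fasta")), by="id")` followed by `write_fasta(removed, open("removed.fasta", "w"))` would give me the separate file of filtered records that I could inspect (the file names here are only placeholders).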

Thank you very much for clarifying this.

May 20 '24 07:05 nguyenvantuananh99