Bio-py
How can I find out which sequences the "remove_duplicate_seqs" Python script removed?
Dear author,
Thank you so much for your valuable Python script for removing duplicate sequences from a genome FASTA file, either by ID or by the sequence itself: https://github.com/SilentGene/Bio-py/tree/master/remove_duplicate_seqs
I have one question about the script: how can I find out which specific sequences it removed? I would like the IDs or sequences that were filtered out to be written to a separate file, so that I can check that the filtering was accurate.
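To illustrate the kind of output I have in mind, here is a minimal sketch in plain Python. It is not taken from your script: `read_fasta`, `split_duplicates`, and `write_removed_ids` are hypothetical helper names of my own, and I am assuming that the first occurrence of each ID or sequence is kept and that sequences are compared case-insensitively.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_duplicates(records, by="seq"):
    """Return (kept, removed); the first record seen for each key is kept."""
    seen = set()
    kept, removed = [], []
    for header, seq in records:
        # Assumption: an ID is the first whitespace-delimited token of the
        # header, and sequences are compared ignoring case.
        key = (header.split() or [""])[0] if by == "id" else seq.upper()
        if key in seen:
            removed.append((header, seq))
        else:
            seen.add(key)
            kept.append((header, seq))
    return kept, removed


def write_removed_ids(removed, path):
    """Write one removed header per line to a separate text file."""
    with open(path, "w") as out:
        for header, _ in removed:
            out.write(header + "\n")


# Small in-memory example: "b" repeats the sequence of "a" in lower case.
example = io.StringIO(">a\nACGT\n>b\nacgt\n>c\nGGCC\n")
kept, removed = split_duplicates(read_fasta(example), by="seq")
```

In this example `removed` holds only the record `b`, and calling `write_removed_ids(removed, "removed_ids.txt")` (the file name is just a placeholder) would give me the list I am asking about.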
Thank you so much for clarifying this.