
Order & Orientation of Sequences Matter for DeNovo Assembler

Open robinemig opened this issue 5 years ago • 0 comments

I've repeatedly seen the length of the final consensus sequence change when the input sequences are ordered from longest to shortest instead of shortest to longest, or when the orientation of the sequences changes. This is the code I'm calling:

```csharp
Bio.Algorithms.Assembly.OverlapDeNovoAssembler assem = new Bio.Algorithms.Assembly.OverlapDeNovoAssembler();
assem.OverlapAlgorithm.GapOpenCost = -10;
assem.OverlapAlgorithm.GapExtensionCost = -2;
assem.OverlapAlgorithm.SimilarityMatrix = new SimilarityMatrix(SimilarityMatrix.StandardSimilarityMatrix.AmbiguousDna);
var assembly = assem.Assemble(reads) as Bio.Algorithms.Assembly.OverlapDeNovoAssembly;
```

To compare runs, I look at `assembly.Contigs.First().Consensus.Count`. I've tried to make simulated data to provide a test case, but haven't found a set that reproduces the problem. I can verify, though, that it happens with as few as two sequences.
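For anyone who wants to check their own reads, the comparison could be scripted roughly as in the sketch below. It is not a confirmed reproduction: the caller has to supply the reads, the class and method names (`OrderSensitivityCheck`, `ConsensusLength`, `Report`) are hypothetical, and it assumes that `SimilarityMatrix` lives in `Bio.SimilarityMatrices` and that `ISequence.GetReverseComplementedSequence()` is available in the .NET Bio version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Bio;
using Bio.Algorithms.Assembly;
using Bio.SimilarityMatrices; // assumed namespace for SimilarityMatrix

// Hypothetical helper, not part of .NET Bio.
static class OrderSensitivityCheck
{
    // Assembles the reads with the settings from this issue and returns the
    // length of the first contig's consensus, or -1 if no contig was produced.
    static long ConsensusLength(IEnumerable<ISequence> reads)
    {
        var assem = new OverlapDeNovoAssembler();
        assem.OverlapAlgorithm.GapOpenCost = -10;
        assem.OverlapAlgorithm.GapExtensionCost = -2;
        assem.OverlapAlgorithm.SimilarityMatrix =
            new SimilarityMatrix(SimilarityMatrix.StandardSimilarityMatrix.AmbiguousDna);

        var assembly = assem.Assemble(reads) as OverlapDeNovoAssembly;
        var contig = assembly?.Contigs.FirstOrDefault();
        return contig == null ? -1 : contig.Consensus.Count;
    }

    // Prints the consensus length for three arrangements of the same reads.
    public static void Report(IList<ISequence> reads)
    {
        var shortestFirst = reads.OrderBy(r => r.Count).ToList();
        var longestFirst = reads.OrderByDescending(r => r.Count).ToList();

        // Same reads, each one reverse-complemented (assumed ISequence method).
        var flipped = reads.Select(r => r.GetReverseComplementedSequence()).ToList();

        Console.WriteLine("shortest first: " + ConsensusLength(shortestFirst));
        Console.WriteLine("longest first:  " + ConsensusLength(longestFirst));
        Console.WriteLine("reverse compl.: " + ConsensusLength(flipped));
    }
}
```

If the three printed lengths differ for the same set of reads, that set would make a usable test case for this issue.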

robinemig · Apr 26 '20 18:04