TreeKO comparison doesn't handle adjacent duplications
Hello,
I found that when calculating the TreeKO distance (the PhyloTree.compare() method with has_duplications=True), the method raises an error if two leaves with the same name are children of the same parent node, but not if the duplicated leaves are separated by two or more nodes. This looks like a bug to me, since adjacently duplicated leaves can legitimately occur, for example in a gene tree with orthologs.
import ete4
t1 = ete4.PhyloTree('((A,B),(A,C));')
t2 = ete4.PhyloTree('(A,(B,C));')
print(t1.compare(t2, has_duplications=True)) # works fine
t1 = ete4.PhyloTree('((A,A),(B,C));')
print(t1.compare(t2, has_duplications=True)) # raises 'TreeError: Duplicated items found in target tree.'
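To make the distinction between the two cases explicit, here is a minimal sketch that flags "adjacent" duplications, meaning leaves with the same name that share a parent. It does not use ete4 and is not meant to reflect how compare() works internally; the nested-tuple tree representation and the helper name `adjacent_duplicates` are assumptions made only for this illustration.

```python
from collections import Counter


def adjacent_duplicates(tree):
    """Return the sorted leaf names that appear more than once
    among the direct children of a single internal node.

    `tree` is a nested tuple of strings (hypothetical stand-in for a
    parsed Newick tree): a string is a leaf, a tuple is an internal node.
    """
    found = set()
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            continue  # leaves have no children to inspect
        # Count only the leaf children of this node.
        counts = Counter(child for child in node if isinstance(child, str))
        found.update(name for name, n in counts.items() if n > 1)
        # Descend into internal children.
        stack.extend(child for child in node if not isinstance(child, str))
    return sorted(found)


separated = (("A", "B"), ("A", "C"))  # ((A,B),(A,C)); the case that works
adjacent = (("A", "A"), ("B", "C"))   # ((A,A),(B,C)); the case that raises

print(adjacent_duplicates(separated))  # []
print(adjacent_duplicates(adjacent))   # ['A']
```

A check like this only identifies which input trees fall into the failing case; it says nothing about why the error is raised.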
Hi @blasks,
Thanks for reporting this. I personally don't know about the inner workings of compare(), so I'm just going to nudge @jhcepas about it :)