TreeTools

Quality of a dataset

Open ms609 opened this issue 3 years ago • 0 comments

Haag et al. measure the ruggedness of a tree landscape using a regression model (trained on molecular datasets, implemented in C) based on:

  • Unique topologies after 100 parsimony searches: 42.9 %
  • RF-Distance between parsimony trees: 33.2 %
  • Entropy (Average Shannon entropy per column): 17.0 %
  • Patterns (unique columns)-over-taxa: 13.6 %
  • % Gaps: 2.5 %
  • Bollback: 2.3 %
  • Sites (n columns)-over-taxa: 1.5 %
  • % Invariant columns: 0.6 %
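
For reference, the alignment-only measures in this list (entropy, patterns-over-taxa, % gaps, sites-over-taxa, % invariant columns) could be sketched roughly as below. This is not Haag et al.'s C implementation: the function name `alignment_features`, the choice of `-` and `?` as gap characters, the base-2 logarithm, the exclusion of gaps from the entropy, and the rule that a column with at most one distinct non-gap state counts as invariant are all assumptions made for illustration. The tree-based measures and the Bollback statistic are left out.

```python
from collections import Counter
from math import log2

# Assumption: these characters mark gaps / missing data.
GAP_CHARS = {"-", "?"}


def alignment_features(alignment):
    """Summarise an alignment given as {taxon name: sequence string}.

    Hypothetical helper; definitions are illustrative and may differ
    from those used by Haag et al.
    """
    seqs = list(alignment.values())
    n_taxa = len(seqs)
    n_sites = len(seqs[0])
    if any(len(s) != n_sites for s in seqs):
        raise ValueError("All sequences must have the same length")

    # One string per column (site) of the alignment.
    columns = ["".join(s[i] for s in seqs) for i in range(n_sites)]

    def column_entropy(col):
        # Shannon entropy (bits) of the non-gap states in one column.
        counts = Counter(c for c in col if c not in GAP_CHARS)
        total = sum(counts.values())
        if total == 0:
            return 0.0
        return -sum((k / total) * log2(k / total) for k in counts.values())

    n_gaps = sum(c in GAP_CHARS for col in columns for c in col)
    # Assumption: all-gap columns are counted as invariant too.
    n_invariant = sum(len(set(col) - GAP_CHARS) <= 1 for col in columns)

    return {
        "entropy": sum(column_entropy(col) for col in columns) / n_sites,
        "patterns_over_taxa": len(set(columns)) / n_taxa,
        "sites_over_taxa": n_sites / n_taxa,
        "prop_gaps": n_gaps / (n_sites * n_taxa),
        "prop_invariant": n_invariant / n_sites,
    }


# Toy alignment: four taxa, four sites.
toy = {"a": "AAC-", "b": "AGC-", "c": "AGCT", "d": "ATCT"}
features = alignment_features(toy)
```

For a morphological dataset the same quantities could be computed in the same way, although the meaning of "gap" (inapplicable versus missing) would need to be decided first.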

ms609 · Jun 22 '22 08:06