Describe Limitations for UMAP Plot
@MercifulCode Are there limitations to the annotations that are displayed in the UMAP plot? I think HUMAN does not appear for these example files:
https://github.com/dfci/alignmentviewer/tree/master/example
I think the example annotation file is a bit out of date - there should not be a header, and the last character for each sequence name is missing.
That said, after digging some more into the bioblocks-viz code, I found that indeed there is a limitation to the number of annotations displayed!
Internally, UMAPSequenceContainerClass sets the color associated with each label using a predefined array of colors. If there are more labels than colors, the remaining labels are classified as "unannotated."
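To make the behavior concrete, here is a minimal sketch of what I understand to be happening. It is not the actual bioblocks-viz code: the `PALETTE` values, the `assignColors` name, and the return shape are all made up for illustration.

```typescript
// Hypothetical palette; the real array in bioblocks-viz is longer and differs.
const PALETTE: readonly string[] = ["red", "green", "blue"];

interface LabelAssignment {
  colored: Map<string, string>; // label -> color
  unannotated: string[]; // labels left over once the palette runs out
}

// Assumed behavior: each unique label takes the next free color, in order of
// first appearance; anything past the end of the palette gets no color.
function assignColors(
  labels: readonly string[],
  palette: readonly string[] = PALETTE,
): LabelAssignment {
  const colored = new Map<string, string>();
  const unannotated: string[] = [];
  for (const label of Array.from(new Set(labels))) {
    const color = palette[colored.size];
    if (color !== undefined) {
      colored.set(label, color);
    } else {
      unannotated.push(label);
    }
  }
  return { colored, unannotated };
}
```

With a 3-color palette and the labels HUMAN, MOUSE, YEAST, FLY, the first three get a color and FLY ends up in the unannotated bucket, which would explain a species silently going missing from the legend.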
Going forward, I see a few possible solutions:
- Allowing the set of colors to be passed to the component.
- Adding a flag to re-use colors if the limit is hit.
- In the case of too many labels, renaming "Unannotated" to something better.
- Paginating the labels - for example, if there are 40 unique labels but only 10 colors, we'd have 4 pages of labels.
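For the pagination option, the arithmetic is just chunking the unique labels by the number of available colors. A rough sketch, where `paginateLabels` is a hypothetical helper and not something that exists in bioblocks-viz today:

```typescript
// Split labels into pages of at most `pageSize` entries.
// `pageSize` would be the number of available colors.
function paginateLabels<T>(labels: readonly T[], pageSize: number): T[][] {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const pages: T[][] = [];
  for (let start = 0; start < labels.length; start += pageSize) {
    pages.push(labels.slice(start, start + pageSize));
  }
  return pages;
}
```

40 labels with 10 colors gives 4 full pages; 41 labels would give 5, with a single label on the last page.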
I'm leaning towards both options 1 & 2, as that way all labels can be viewable at once. That said, if we increase the number of labels to display, the labels would need to be placed inside a scrollable list to prevent overflow.
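A rough sketch of how options 1 & 2 could fit together. The option names (`palette`, `reuseColors`), the default palette, and `colorsForLabels` are placeholders I made up here, not the current component API:

```typescript
// Hypothetical options; neither exists on the component today.
interface ColorOptions {
  palette?: readonly string[]; // option 1: caller-supplied colors
  reuseColors?: boolean; // option 2: wrap around when the palette runs out
}

// Placeholder default, not the real bioblocks-viz palette.
const DEFAULT_PALETTE: readonly string[] = ["red", "green", "blue"];

// Returns label -> color, or null when the label would stay unannotated.
function colorsForLabels(
  labels: readonly string[],
  options: ColorOptions = {},
): Map<string, string | null> {
  const palette = options.palette ?? DEFAULT_PALETTE;
  const reuseColors = options.reuseColors ?? false;
  const result = new Map<string, string | null>();
  Array.from(new Set(labels)).forEach((label, index) => {
    const canColor =
      palette.length > 0 && (reuseColors || index < palette.length);
    // The modulo only wraps when reuseColors is on; otherwise index is in range.
    result.set(label, canColor ? palette[index % palette.length] ?? null : null);
  });
  return result;
}
```

With `reuseColors` off this keeps the current behavior, so it could be introduced without changing existing plots. The trade-off with re-use is that two labels share a color, so the legend alone can no longer distinguish them.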
What do you think?
@MercifulCode we talked a bit yesterday. Some of the suggestions were:
- Scroll to see more annotation labels
- Use color/shape combinations to have more possible options of annotations
- Allow developers to pass pointers to SVGs or PNGs to customize the annotation display themselves
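For the color/shape idea, the number of distinct markers becomes colors x shapes. A minimal sketch of generating the combinations; the `Marker` type, the function name, and the shape names are assumptions for illustration only:

```typescript
interface Marker {
  color: string;
  shape: string;
}

// Build every color/shape pair. All colors are used with the first shape
// before moving to the next shape, so small label sets still differ by color.
function markerCombinations(
  colors: readonly string[],
  shapes: readonly string[],
): Marker[] {
  const markers: Marker[] = [];
  for (const shape of shapes) {
    for (const color of colors) {
      markers.push({ color, shape });
    }
  }
  return markers;
}
```

So 10 colors and 4 shapes would cover 40 labels without re-using any marker, at the cost of a busier plot.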