Allow using arbitrary vectors in plotting (not only `obs` columns)
- [x] Additional function parameters / changed functionality / changed defaults?
- [ ] New analysis tool: A simple analysis tool you have been using and are missing in `sc.tools`?
- [ ] New plotting function: A kind of plot you would like to see in `sc.pl`?
- [ ] External tools: Do you know an existing package that should go into `sc.external.*`?
- [ ] Other?
Quite often I need to color UMAPs by features that are not part of `adata.X` but are stored in `adata.obsm` because they are special, e.g. KO data with gRNAs versus endogenes/target genes, or viral genes versus endogenes.
Example use case:
- Cluster cells based on endogenes
- UMAP and color by a bunch of viral genes
Clustering must not include these viral genes, so they must be excluded from `X`.
I don't want to store so many additional columns in obs and I need to have these features separated in their own matrix for downstream analysis, which is why I want to use obsm.
Can we have something like this?
`sc.pl.umap(adata, color='viral_genes')`, where `adata.obsm['viral_genes']` is a `pandas.DataFrame`
It shouldn't be overcomplicated, I think, since this only involves one additional check: if an element of the `color` argument is found neither in `obs.columns` nor in `var_names`, look it up in the keys of `obsm` and use the entire DataFrame behind that key.
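To make the proposed check concrete, here is a rough sketch of the lookup order in plain Python. `resolve_color_keys` and its parameters are hypothetical names used for illustration, not part of scanpy, and the gene and column names are made up. The sketch only decides which vectors would be plotted and does no plotting.

```python
from typing import Iterable, List, Mapping, Sequence, Tuple


# Hypothetical helper, not part of scanpy: expands `color` keys into
# (source, name) pairs, one pair per vector that would be plotted.
def resolve_color_keys(
    keys: Iterable[str],
    obs_columns: Sequence[str],
    var_names: Sequence[str],
    obsm_columns: Mapping[str, Sequence[str]],
) -> List[Tuple[str, str]]:
    obs_set, var_set = set(obs_columns), set(var_names)
    resolved: List[Tuple[str, str]] = []
    for key in keys:
        if key in obs_set:
            resolved.append(("obs", key))
        elif key in var_set:
            resolved.append(("var", key))
        elif key in obsm_columns:
            # Proposed fallback: use every column of the DataFrame behind this key.
            resolved.extend((f"obsm/{key}", col) for col in obsm_columns[key])
        else:
            raise KeyError(f"{key!r} not found in obs, var_names, or obsm")
    return resolved


# Made-up example data: one obs column and one obsm key holding two viral genes.
example = resolve_color_keys(
    ["louvain", "viral_genes"],
    obs_columns=["louvain", "n_counts"],
    var_names=["ACTB", "GAPDH"],
    obsm_columns={"viral_genes": ["gag", "pol"]},
)
print(example)
```

With a real AnnData object, the inputs could presumably be built from `adata.obs.columns`, `adata.var_names`, and `{k: list(v.columns) for k, v in adata.obsm.items() if hasattr(v, "columns")}`, so that only DataFrame-valued `obsm` entries are considered.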
This has been worked on here: https://github.com/theislab/anndata/pull/342
The idea is to allow any vector from the anndata object to be used for coloring, but that PR seems a bit stalled at the moment. This would also be useful for providing parameters in other places, like regress_out.