Data points do not match dendrogram structure
Hi, Thanks for developing this package!
I have been testing it in a hematopoiesis dataset and every time I get to the dendrogram, some cells do not overlap certain branches and the root seems to be off.
Is this an issue with the dendrogram function or rather with my data?
This is what the tree and milestones look like:
Additionally, are there any recommendations on how to perform the analysis when we want to compare 2 conditions (i.e. a control and a knockout)? Is it better to define the milestones jointly, or to perform the analysis separately for each condition? I want to identify the genes that change across cell fates in the context of the perturbation.
Hi @mcortes-lopez, thanks for reaching out! It seems like the issue could be due to some spurious intermediate segments that contain no cells, which is a situation scf.tl.dendrogram does not know how to handle (and likely other downstream functions do not either).
I would suggest cleaning up the tree or changing the ppt parameters before constructing the dendrogram. One way to check for empty segments is the following: len(adata.obs.seg.cat.categories)==len(adata.uns["graph"]["pp_seg"].n). If this returns False, at least one segment in the graph has no cells assigned to it.
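The length comparison above only says whether a mismatch exists. A minimal sketch of how one might list which segments are empty is shown below. It uses toy pandas objects as stand-ins for adata.obs.seg and adata.uns["graph"]["pp_seg"], and the helper name find_empty_segments is hypothetical (it is not part of scFates). It assumes the n column of pp_seg holds the segment ids and compares ids as strings in case the two tables store them with different dtypes.

```python
import pandas as pd


def find_empty_segments(cell_segments, pp_seg):
    """Return the ids of segments listed in pp_seg that have no cells assigned.

    cell_segments: per-cell segment labels (stand-in for adata.obs.seg)
    pp_seg: segment table with an "n" column of segment ids
            (stand-in for adata.uns["graph"]["pp_seg"]; assumed layout)
    """
    # Count cells per segment; cast to str so "3" and 3 are treated alike.
    counts = cell_segments.astype(str).value_counts()
    graph_ids = pp_seg["n"].astype(str)
    # Keep the graph segments that no cell is assigned to.
    return [seg for seg in graph_ids if counts.get(seg, 0) == 0]


# Toy data: the graph lists four segments, but no cell falls on segment "3".
cell_segments = pd.Series(["1", "1", "2", "4", "4"], dtype="category")
pp_seg = pd.DataFrame({"n": ["1", "2", "3", "4"]})

empty = find_empty_segments(cell_segments, pp_seg)
print(empty)
```

On real data the same helper would be called with adata.obs.seg and adata.uns["graph"]["pp_seg"], provided the objects have the layout assumed here.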
Concerning the comparison of conditions, I would suggest learning a common trajectory for both conditions. However, scFates has very limited multi-condition testing, which only applies to single non-bifurcating trajectories. To that end, I am currently working on a completely new framework designed specifically for learning context differences in fate hierarchies. It is still a work in progress, so stay tuned in the coming months!
Hi @LouisFaure! Thank you very much for the explanation. I understand the source of the problem now: adata.uns["graph"]["pp_seg"] has one more segment than adata.obs.seg.cat.categories, as no cells get assigned to it. However, now that the problem is identified, I am not sure how to fix it. Could you please clarify this? Thank you