Running NICHES on large spatial datasets
I have encountered an interesting bug that I think is related to how we calculate edgelists in spatial data.
If I RunNICHES on imputed spatial data of reasonable size, I get the following:
and the loop quits after i=1
If I RunNICHES on the same spatial data without imputation, the loop gets through i=3, then quits and gives the following:
Thoughts?
I think that
- we probably need to update how we compute edgelists
- something about more vs. less sparse data is having an effect here... why?
- Bug #37 part A above is a data-handling bug internal to NICHES and can be bypassed, for now, by limiting the input object to just the genes in your ground truth, i.e.:
    fantom.features <- unique(c(NICHES::ncomms8866_mouse$Ligand.ApprovedSymbol,
                                NICHES::ncomms8866_mouse$Receptor.ApprovedSymbol))
    temp <- data.list[[i]]          # copy input object to temp
    temp <- temp[fantom.features,]  # limit to features necessary for NICHES
    niches.output <- RunNICHES(object = temp, ... # etc as per normal workflows
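For anyone trying the workaround above, here is a minimal base-R sketch of the same feature restriction on a plain matrix. It is only an illustration: `expr` and `lr.features` are hypothetical stand-ins (a toy genes-by-cells matrix and a made-up ligand/receptor list), not NICHES objects. The `intersect()` guard is my own addition and an assumption about what you may need: it keeps only features that are actually present in the data.

```r
# Hypothetical stand-in for an expression matrix (genes x cells); not NICHES data.
expr <- matrix(
  1:12, nrow = 4,
  dimnames = list(c("Tgfb1", "Tgfbr1", "Actb", "Gapdh"),
                  paste0("cell", 1:3))
)

# Hypothetical ligand/receptor list; in the thread this is built from
# NICHES::ncomms8866_mouse.
lr.features <- unique(c("Tgfb1", "Tgfbr1", "Vegfa"))

# Keep only the ligand/receptor features that exist in the data. Indexing a
# matrix by a missing row name would raise a "subscript out of bounds" error.
keep <- intersect(lr.features, rownames(expr))

# drop = FALSE keeps the result a matrix even if only one feature remains.
temp <- expr[keep, , drop = FALSE]

print(dim(temp))
print(rownames(temp))
```

Whether the guard is needed for your own input object depends on how its `[` method treats missing features, so treat this as a sketch and not as part of the official workaround.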
Bug #37 part B above is an edge-list efficiency/scaling bug that is currently fixed by AOZ in the 'msbr_explorations' branch.
Release v.1.2.0 fixes this issue