online_iNMF error
Hi All,
Once again I'm finding the online branch really helpful for analyzing some very large datasets, but I got an error today that I hope you can help with. As background, the object I'm trying to analyze is 11 datasets from 10X imported with Seurat V3. I filter cells based on QC metrics and then convert to liger. All of this is running on R 3.5.3 with the most up-to-date version of the online branch.
Everything looks fine until I get to the online_iNMF step, when this error comes at the very end:
obj <- normalize(obj)
obj <- selectGenes(obj, do.plot = T, num.genes = 900)
obj <- scaleNotCenter(obj)
> obj <- online_iNMF(obj, k = 40, lambda = 10, miniBatch_size = 500)
Starting Online iNMF... |==============================================================================================================| 100%
Calculate metagene loadings...
Error in object@scale.data[[i]][1:num_genes, cell_idx] :
subscript out of bounds
This failure occurs when, in addition to the gene and UMI per-cell filters, I also set a %mito filter to keep any cells with less than 70% mito (I know it's high, but I'm doing that on purpose for this analysis). However, if I impose a more stringent filter to keep only cells with less than 50% mito and keep all other settings the same in both Seurat and Liger, then it succeeds with no issues.
So the problem is obviously somewhere in the filtering, but it's unclear why less stringent filtering is causing the issue.
Thoughts?
Thanks! Sam
Hi Sam,
Are you operating on the same set of h5 files (11 in total) with different filters? When you call normalize(), a slot called cell.data is written into those h5 files, and online_iNMF() uses it to get the number of cells for each dataset. However, if you modify the filter in the preprocessing step, the previous cell.data in the h5 files might not be overwritten properly, eventually causing the "subscript out of bounds" error. Generating a new set of h5 files for the new filter should address the problem. It would also be helpful to keep a backup set in case the h5 files are corrupted or messed up during analysis.
Let us know if it works. Thanks!
Best, Chao
Hi Chao,
Sorry, I should have added in my original comment that this is an all-in-memory dataset: it's a reanalysis of a large published dataset, and the only files they provided were the standard 10X barcodes, features, and matrix files.
So I read the data into Seurat using the Read10X & CreateSeuratObject functions, perform filtering, then convert Seurat to Liger:
obj <- seuratToLiger(objects = obj, combined.seurat = TRUE, meta.var = "sample_id")
Then I run the liger commands as above.
When I got the subscript out of bounds error (after trying a couple of other things first), I went back to Seurat, re-imported the data, used different filtering criteria, then converted to liger and ran again, and it succeeded. So I'm always overwriting the entire liger object when I change filtering parameters, because I run the conversion step after filtering.
Best, Sam
Hello Sam,
Once a Seurat object is converted to a Liger object through seuratToLiger(), cell.data is generated in the liger object. Hence it is necessary to re-convert whenever different filters are used to select cells.
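A quick consistency check along these lines can confirm that cell.data and the data matrices agree after conversion. This is only a sketch in base R: `check_cell_counts` is a hypothetical helper, not part of liger, and it assumes that the metadata carries one dataset label per cell and that cells are the columns of each matrix (pass `cells_in = "rows"` if they are rows).

```r
# Hypothetical helper (not part of liger): compare the number of cells
# recorded per dataset in the metadata with the size of each data matrix.
check_cell_counts <- function(mats, dataset_labels,
                              cells_in = c("cols", "rows")) {
  cells_in <- match.arg(cells_in)
  count_fun <- if (cells_in == "cols") ncol else nrow
  # Cells actually present in each matrix
  in_matrix <- unname(vapply(mats, count_fun, integer(1)))
  # Cells the metadata claims for each dataset, in the same order as `mats`
  in_meta <- as.integer(table(factor(dataset_labels, levels = names(mats))))
  data.frame(dataset = names(mats),
             in_matrix = in_matrix,
             in_meta = in_meta,
             ok = in_matrix == in_meta)
}

# Toy data standing in for a liger object: two datasets with 4 and 3 cells.
mats <- list(a = matrix(0, nrow = 5, ncol = 4),
             b = matrix(0, nrow = 5, ncol = 3))
# The metadata wrongly lists 5 cells for dataset "b".
labels <- c(rep("a", 4), rep("b", 5))
res <- check_cell_counts(mats, labels)
print(res)
```

On a real object the call might look like `check_cell_counts(obj@scale.data, obj@cell.data$dataset)`, assuming those slot and column names apply to the installed version. Any row with `ok` equal to FALSE would point to a dataset where indexing by cell could run past the end of the matrix.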
Best, Chao
Hi Chao,
Yes, that is what I'm doing. That is how I know that filtering to keep <50% mito succeeds while filtering to keep <70% mito fails. The two were run separately with identical commands except for the filtering part in Seurat. But for some reason the less stringent filtering results in failure during the online_iNMF step.
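To see what differs between the two runs, it may help to tabulate how many cells each sample keeps under each cutoff. The sketch below uses base R on made-up metadata: the column names `sample_id` and `percent_mito` and the helper `cells_kept` are assumptions for illustration, and in practice the table would come from the Seurat object's metadata.

```r
# Hypothetical helper: count cells per sample that pass a %mito cutoff.
# Keeping sample_id as a factor means samples with zero cells still appear.
cells_kept <- function(meta, cutoff) {
  table(meta$sample_id[meta$percent_mito < cutoff])
}

# Made-up metadata standing in for the real per-cell QC table (not real results).
meta <- data.frame(
  sample_id = factor(c("s1", "s1", "s1", "s2", "s2", "s2")),
  percent_mito = c(5, 45, 65, 10, 55, 69)
)

comparison <- data.frame(
  sample_id = levels(meta$sample_id),
  kept_50 = as.integer(cells_kept(meta, 50)),
  kept_70 = as.integer(cells_kept(meta, 70))
)
# Cells that only enter the analysis under the looser cutoff
comparison$extra_at_70 <- comparison$kept_70 - comparison$kept_50
print(comparison)
```

A sample whose cell count changes sharply between the two cutoffs would be a natural place to start looking.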
Best, Sam