get_de_genes: At least 90% of genes do not match between the SpatialRNA and Reference objects.

Open nasimbio opened this issue 1 year ago • 3 comments

I am running into this error during RCTD deconvolution of Visium HD data. The Visium HD dataset has 18,085 genes and the single-cell reference has 42,496; they share 17,201 genes. The error is:

Error in get_de_genes(cell_type_info$info, puck.original, fc_thresh = config$fc_cutoff_reg, : get_de_genes: At least 90% of genes do not match between the SpatialRNA and Reference objects.

I subset both objects to the common genes, but that did not solve the issue. I'd appreciate any help with this.

nasimbio avatar Jan 29 '25 03:01 nasimbio

Dear nasimbio,

I’m experiencing the same exact issue with Visium HD data. After investigating the create.RCTD and get_de_genes functions, here's what I found:

  1. The get_de_genes function internally filters out genes with fewer than 3 counts across spots, using a MIN_OBS parameter, which seems to be fixed and not modifiable by the user.

  2. It then checks whether the length of that filtered gene vector is less than 0.1 times the number of genes in the scRNA-seq reference.

  3. If this condition is met, it throws an error.

In my case, running the following:

```r
sum(rowSums(spatialRNA@counts) >= 3) < 0.1 * length(rownames(reference@counts))
```

returns TRUE, confirming that this is the condition triggering the error. When I change the MIN_OBS parameter to 2 (by editing the code of the get_de_genes function), the function executes correctly.

Could you check if your data, like mine, has too many genes with fewer than 3 counts across the spots? I'm not sure changing the MIN_OBS parameter is ideal, however.
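The check described in the three steps above can be sketched on toy data with base R only. This is a rough illustration of the logic as described in this comment, not a copy of the spacexr source: `check_gene_filter` and its `min_obs` argument are hypothetical names, and the inputs are plain matrices and character vectors rather than SpatialRNA and Reference objects.

```r
# Hypothetical helper (not part of spacexr): mirrors the check described above
# using a plain count matrix with genes as row names.
check_gene_filter <- function(sp_counts, ref_genes, min_obs = 3) {
  gene_totals <- rowSums(sp_counts)                 # total counts per gene across spots
  shared <- intersect(ref_genes, names(gene_totals))
  kept <- shared[gene_totals[shared] >= min_obs]    # genes passing the minimum-count filter
  list(
    n_shared = length(shared),
    n_kept = length(kept),
    threshold = 0.1 * length(ref_genes),
    would_fail = length(kept) < 0.1 * length(ref_genes)
  )
}

# Toy data: 4 spatial genes across 2 spots; the reference lists 30 genes.
sp_counts <- matrix(
  c(5, 0, 1, 0,   # spot1
    4, 1, 0, 0),  # spot2
  nrow = 4,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"), c("spot1", "spot2"))
)
ref_genes <- c("GeneA", "GeneB", "GeneC", paste0("Ref", 1:27))

res <- check_gene_filter(sp_counts, ref_genes)
print(res)
```

Calling the helper again with `min_obs = 1` on the same toy data shows how sensitive the outcome is to the count filter when the spatial data is sparse, which is the situation described here for Visium HD.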

dgcamblor avatar Feb 07 '25 21:02 dgcamblor

You can subset the two count matrices before running RCTD, like this:

```r
library(spacexr)

# Keep genes with more than 3 total counts in each dataset
scCounts <- scCounts[rowSums(scCounts) > 3, ]
spCounts <- spCounts[rowSums(spCounts) > 3, ]

# Restrict both matrices to the genes they share
co_genes <- intersect(rownames(scCounts), rownames(spCounts))
scCounts <- scCounts[co_genes, ]
spCounts <- spCounts[co_genes, ]

reference <- Reference(scCounts, sc_celltype)
query <- SpatialRNA(spCoord, spCounts, colSums(spCounts))
RCTD <- create.RCTD(query, reference, max_cores = ncores, UMI_min = UMI_min)
RCTD <- run.RCTD(RCTD, doublet_mode = "doublet")
```

tangmindan avatar May 20 '25 01:05 tangmindan

Hi,

I recommend manually checking that the gene names in the two datasets agree. If you want, you can copy and paste the code from get_de_genes and step through it yourself. Often, when we see this error, the cause is something simple, such as gene names being capitalized differently or having another label appended. If you post more information, we can help you further.
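As a starting point for that manual check, here is a minimal base R sketch. `compare_gene_names` is a hypothetical helper, not a spacexr function, and the normalisation it applies (ignoring case and dropping a trailing `.N` suffix) is only an assumption about what a mismatch might look like; adjust it to whatever your gene names actually contain.

```r
# Hypothetical helper (not part of spacexr): counts how many gene names match
# exactly, and how many match after a simple normalisation.
compare_gene_names <- function(sp_genes, ref_genes) {
  # Assumed normalisation: drop a trailing ".N" suffix and ignore case
  normalise <- function(x) toupper(sub("\\.[0-9]+$", "", x))
  c(
    exact = length(intersect(sp_genes, ref_genes)),
    normalised = length(intersect(normalise(sp_genes), normalise(ref_genes)))
  )
}

# Toy gene names; with real objects these would be the row names of the two count matrices
sp_genes <- c("Actb", "Gapdh", "Malat1")
ref_genes <- c("ACTB", "GAPDH.1", "Xist")

overlap <- compare_gene_names(sp_genes, ref_genes)
print(overlap)

# A few spatial gene names with no exact match in the reference
print(head(setdiff(sp_genes, ref_genes)))
```

A large gap between the exact and normalised counts would point to a naming mismatch rather than a sparsity problem.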

Best, Dylan

dmcable avatar Jul 19 '25 00:07 dmcable