Seurat Custom Functions

Open · pshukla63 opened this issue 2 years ago · 1 comment

Hello, I want to run clusterExperiment with a custom clustering function. Would it be possible to use the Seurat functions for clustering? Maybe something like this:

SNN_wrap <- function(inputMatrix, k, pcs = 20, ...) {
  pca <- RunPCA(inputMatrix)
  snn <- FindNeighbors(pca@cell.embeddings[, 1:pcs])
  res <- FindClusters(snn$snn, resolution = k)
  return(as.numeric(as.character(res[, 1])))
}

SNN <- ClusterFunction(SNN_wrap, inputType = "X", algorithmType = "K", outputType="vector")

Running the individual Seurat functions (RunPCA, FindNeighbors, FindClusters) on dummy data works. However, the resolution parameter is not equivalent to "k": it is a continuous value, not an integer number of clusters. I get errors when I try to run ClusterFunction.

Would it be possible to integrate the Seurat functions into clusterMany in some way?

Thank you!

pshukla63 · Jan 29 '24 13:01

I think I got it working:

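library(Seurat)
library(SingleCellExperiment)
library(clusterExperiment)
library(magrittr) # provides %>%

SeuratFunctions <- function(inputMatrix, k, ...) {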
  # Grid of 29 Seurat resolutions; the integer k is used as an index into it
  resolutions <- seq(0.1, 1.5, by = 0.05)
  # PCA is rerun on every call, as this is the only way I got it to work
  res <- RunPCA(inputMatrix, npcs = 40)

  # Next lines adapted from https://hbctraining.github.io/scRNA-seq/lessons/sc_exercises_clustering_analysis.html
  # Determine percent of variation associated with each PC
  pct <- res@stdev / sum(res@stdev) * 100
  # Determine the difference between the variation of each PC and the subsequent PC,
  # then take the last PC where that drop exceeds 0.1 (plus one)
  drops <- pct[-length(pct)] - pct[-1]
  best_dim <- sort(which(drops > 0.1), decreasing = TRUE)[1] + 1

  res <- res@cell.embeddings[, 1:best_dim] %>%
    FindNeighbors(.) %>%
    # FindNeighbors returns a list; the second element is the SNN graph
    `[[`(2) %>%
    FindClusters(resolution = resolutions[k])
  # Return vector of clusters
  return(as.numeric(as.character(res[, 1])))
}
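
The dimension-selection step inside SeuratFunctions can be checked in isolation. Below is a minimal base-R sketch, assuming a made-up vector of PC standard deviations. The helper `pick_dims`, its `threshold` argument, and the fallback to all PCs when no drop exceeds the threshold are hypothetical additions for illustration, not part of Seurat or clusterExperiment.

```r
# Hypothetical helper: choose the number of PCs from their standard deviations.
pick_dims <- function(stdev, threshold = 0.1) {
  # Percent of the total standard deviation carried by each PC
  pct <- stdev / sum(stdev) * 100
  # Drop in percent between each PC and the next one
  drops <- pct[-length(pct)] - pct[-1]
  big <- which(drops > threshold)
  # Assumption: if no drop exceeds the threshold, keep every PC
  # (the original one-liner would return NA in this case)
  if (length(big) == 0) {
    return(length(stdev))
  }
  # Last PC that is followed by a large drop, plus one
  max(big) + 1
}

# Made-up standard deviations: three strong PCs followed by a flat tail
toy_stdev <- c(50, 30, 10, 5.02, 5.01, 5.00, 4.99)
best_dim <- pick_dims(toy_stdev)
print(best_dim) # 4
```

With the toy values, the drops after PCs 1 to 3 are well above 0.1 percentage points and the later ones are about 0.009, so the rule keeps four dimensions.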

SeuratPipe <- ClusterFunction(SeuratFunctions,
  inputType = "X", algorithmType = "K",
  outputType = "vector", checkFunctions = FALSE
)

# Running PCA on only variable features - as is done in the Seurat pipeline
varFeats <- VariableFeatures(so)

sce <- SingleCellExperiment(
  list(
    normcounts = as.matrix(GetAssayData(so, assay = "SCT", layer = "scale.data")[varFeats, ]),
    logcounts = as.matrix(GetAssayData(so, assay = "SCT", layer = "data")[varFeats, ])
  ),
  colData = so@meta.data
)

# Using the normalized scaled assay as this is what PCA is run on
ce <- clusterMany(sce,
  clusterFunction = list("SeuratPipe" = SeuratPipe), ncores = 60,
  ks = 1:29, isCount = FALSE, whichAssay = "normcounts"
)
ce <- makeConsensus(ce, proportion = 0.7, clusterLabel = "makeConsensus_0.7")
ce <- makeDendrogram(ce, reduceMethod = "var", nDims = 2000)

# Using logcounts as this is what limma is run on
ce_final <- mergeClusters(ce,
  mergeMethod = "adjP", DEMethod = "limma",
  whichAssay = "logcounts", clusterLabel = "mergeClusters",
  plotInfo = c("adjP"), calculateAll = FALSE
)

# Add new clusters to original Seurat Object
so$mergeClusters <- clusterMatrix(ce_final)[, "mergeClusters"]
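
Because clusterMany passes integer values of k while FindClusters expects a continuous resolution, the function above treats k as an index into a fixed grid of resolutions. A minimal base-R sketch of that mapping follows. `k_to_resolution` is a hypothetical helper, and deriving `ks` with `seq_along()` is a suggestion, not what the code above does.

```r
# Grid of candidate resolutions, as in SeuratFunctions
resolutions <- seq(0.1, 1.5, by = 0.05)

# Hypothetical helper: translate an integer k into a resolution from the grid
k_to_resolution <- function(k, grid = resolutions) {
  # Fail early instead of silently returning NA for an out-of-range k
  stopifnot(length(k) == 1, k == round(k), k >= 1, k <= length(grid))
  grid[k]
}

# Deriving ks from the grid keeps the two in sync if the grid ever changes
ks <- seq_along(resolutions)

print(length(ks))
print(k_to_resolution(1))
print(k_to_resolution(length(resolutions)))
```

If the grid were defined outside the wrapper, `ks = seq_along(resolutions)` could stand in for the hard-coded `ks = 1:29` in the clusterMany call.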

pshukla63 · Feb 05 '24 15:02