Error with netClustering(cellchat, type = "functional")

Open YitengDang opened this issue 4 years ago • 18 comments

Hi, I'm simply trying to run the tutorial script in Rstudio, but am running into the following problem.

When running cellchat <- netClustering(cellchat, type = "functional"), I get the error message:

[Screenshot of the error message, 2021-04-13; image not preserved]

Initially I thought it might be due to UMAP which wasn't installed, but I installed UMAP and verified that it works. Could you please help me solve this issue? Thanks a lot in advance!

YitengDang avatar Apr 13 '21 13:04 YitengDang

@YitengDang This is an interesting issue. I reinstalled the package from my GitHub repository and could not reproduce the problem. However, you are not the first person to report it recently. You can try the method in the Pull Request suggested by another user.

sqjin avatar Apr 18 '21 23:04 sqjin

@sqjin I am also having this issue when running the vignette!

Getting the following error:

> cellchat <- computeNetSimilarity(cellchat, type = "functional")
> cellchat <- netEmbedding(cellchat, type = "functional")
Manifold learning of the signaling networks for a single dataset 
C:\Users\colek\AppData\Roaming\Python\Python38\site-packages\umap\umap_.py:132: UserWarning: A large number of your vertices were disconnected from the manifold.
Disconnection_distance = 1 has removed 142 edges.
It has fully disconnected 3 vertices.
You might consider using find_disconnected_points() to find and remove these points from your data.
Use umap.utils.disconnected_vertices() to identify them.
  warn(
> #> Manifold learning of the signaling networks for a single dataset
> cellchat <- netClustering(cellchat, type = "functional")
Classification learning of the signaling networks for a single dataset 
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
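
The `do_one` in this message appears to be an internal helper of stats::kmeans, which hands the data matrix to compiled code and rejects non-finite values. A minimal sketch on toy data reproduces the failure and shows that it goes away once the non-finite rows are dropped. The matrix `emb` is made up and only stands in for a 2-D embedding; it is not CellChat output.

```r
# Toy 2-D "embedding": 12 pathways, one of which has NaN coordinates.
set.seed(1)
emb <- matrix(rnorm(24), ncol = 2,
              dimnames = list(paste0("pathway", 1:12), c("UMAP1", "UMAP2")))
emb["pathway3", ] <- NaN

# kmeans() cannot handle non-finite input, so this call fails;
# the error message is captured instead of stopping the script.
msg <- tryCatch(
  {
    kmeans(emb, centers = 2, nstart = 10)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Keeping only rows with finite coordinates lets the clustering run.
ok <- rowSums(!is.finite(emb)) == 0
clusters <- kmeans(emb[ok, , drop = FALSE], centers = 2, nstart = 10)$cluster
print(clusters)
```

If the same check on the real embedding reports non-finite rows, the problem probably originates in the netEmbedding step rather than in netClustering itself.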

ColeKeenum avatar Apr 20 '21 19:04 ColeKeenum

So this issue is persisting for me, even after reinstalling the package and running the tutorial. Exactly the same error also arises when I apply the function to my own data. However, interestingly, if I knit the tutorial (using knitr) to create an html file, it does not give any problems and actually outputs a plot. However, if I knit my own data then it still gives the same error. Altogether, this is very puzzling.

YitengDang avatar Apr 22 '21 14:04 YitengDang

@YitengDang @ColeKeenum Can you guys share your cellchat object so that I can replicate the error?

sqjin avatar Apr 22 '21 16:04 sqjin

@sqjin I am inviting you to a private repo which contains my own data in the file "./CellChat/cellchat_Cell-ECM_out_run2.Rds". Let me know if you can access the data. For the tutorial, I checked that loading the data "./tutorial/cellchat_humanSkin_LS.rds" from this repo still gives the same error for me. Thanks a lot in advance.

YitengDang avatar Apr 23 '21 09:04 YitengDang

This has been solved in one of the updates of CellChat somewhere between 1.0.0 and 1.1.3. After pulling the latest version from GitHub (1.1.3), I was able to run the functional clustering part.

YitengDang avatar Sep 13 '21 20:09 YitengDang

Hello, I have the same error, but my CellChat version is already 1.1.3. Which update do you think was the key to fixing it? Thank you so much!

zyy-doctor avatar Oct 08 '21 07:10 zyy-doctor

Sorry to reopen this issue, but after some reinstallations and updates I'm encountering the same problem again. There seem to be two separate issues here:

  1. An update in the R package future has caused the parallelization to break down in various parts of the code. The solution is to:
     a. rewrite the ifelse loop as explained in this thread;
     b. replace the deprecated future::plan("multiprocess") by future::plan("multisession") wherever you encounter future::plan.
     The functions that need to be updated are all functions that use future, including identifyOverExpressedGenes, identifyOverExpressedInteractions, computeCommunProb and netClustering.

  2. An issue with netClustering(cellchat, type = "functional") specifically that has been mentioned in many other issues, e.g. #301, #278, #336 and several others (too many to list). This seems to be related to the fact that the netClustering function cannot deal with NaN values in the UMAPs. Several upstream and downstream functions are affected, so the whole pipeline needs to be examined.
     a. First we calculate similarities between pathways by running cellchat <- computeNetSimilarity(cellchat, type = "functional"). This generates a similarity matrix stored in cellchat@netP$similarity[['functional']]$matrix.
     b. The UMAPs are then calculated from the similarity matrices by running cellchat <- netEmbedding(cellchat, type = "functional"). The result is stored in cellchat@netP$similarity[["functional"]]$dr. However, for unknown reasons this sometimes gives pathways with NaN values for the UMAPs.
     c. As a result, the netClustering function breaks down because the K-means clustering algorithm invoked in the line idents <- kmeans(data.use, kRange[x], nstart=10)$cluster cannot deal with NaNs.

Altogether, the following temporary patch solves the issue by removing the pathways that have NaNs:

  • In the netClustering() function, after Y <- methods::slot(object, slot.name)$similarity[[type]]$dr[[comparison.name]] add the following lines:
pathways.ignore <- rownames( Y[rowSums(!is.finite(Y))>0, ] )
cellchat@options$pathways.ignore = pathways.ignore
Y <- Y[!rowSums(!is.finite(Y)),] # filter out rows with NaN, not working downstream
methods::slot(object, slot.name)$similarity[[type]]$dr[[comparison.name]] <- Y
data.use <- Y

This filters out the pathways with NaN values in the UMAP embedding.

  • In the main code, run netVisual_embedding with option pathway.remove = cellchat@options$pathways.ignore. This removes the pathways with NaNs from the plot to avoid an error.
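
One possible refinement of the filtering step, sketched on toy data: indexing `rownames(Y)` directly and subsetting with `drop = FALSE` keeps the code working even when exactly one pathway is affected. In that case `Y[cond, ]` collapses to a plain vector and `rownames()` returns NULL. The helper name `split_finite_rows` and the matrix below are hypothetical and not part of CellChat.

```r
# Hypothetical helper: separate rows with non-finite values from the rest.
split_finite_rows <- function(Y) {
  bad <- rowSums(!is.finite(Y)) > 0
  list(
    ignored = rownames(Y)[bad],          # names survive even for a single row
    kept    = Y[!bad, , drop = FALSE]    # stays a matrix even with one row left
  )
}

# Toy embedding (filled column by column) with a single NaN pathway.
Y <- matrix(c(0.1, 0.2, NaN, 0.4,
              1.0, 2.0, NaN, 4.0),
            ncol = 2,
            dimnames = list(c("WNT", "TGFb", "EGF", "NOTCH"), NULL))

res <- split_finite_rows(Y)
print(res$ignored)
print(dim(res$kept))

# For comparison, the single-row subset used in the patch loses its row names:
print(rownames(Y[rowSums(!is.finite(Y)) > 0, ]))
```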

@sqjin: hopefully this helps in solving these recurrent issues that many users seem to face! This is not a final solution since we just filter out a few pathways that don't work well, but it would be better to directly patch either the UMAP (so it doesn't produce NaNs) or the K-means clustering (so it can deal with NaNs without throwing an error). If I have time I'll try to look into how to solve this.

YitengDang avatar Mar 02 '22 14:03 YitengDang

Thanks for your guidance! Actually, I have solved my problem by running it on the server or by re-running it.

zyy-doctor avatar Mar 04 '22 13:03 zyy-doctor

Hi, thanks for this issue and for this great package. I got the same problem when running netClustering on cellchat_humanSkin_LS.rds. I solved it with @YitengDang's last comment, but I still have a problem with netVisual_embedding, described below.

my code line:

netVisual_embedding(cellchat, type = "functional", pathway.remove = cellchat@options$pathways.ignore, label.size = 3.5)

the problem is described as below:

Error in data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), : arguments imply differing number of rows: 10, 13

Traceback:

  1. netVisual_embedding(cellchat, type = "functional", pathway.remove = cellchat@options$pathways.ignore, label.size = 3.5)
  2. data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), labels = as.character(unlist(dimnames(prob)[3])), Groups = as.factor(Groups))
  3. stop(gettextf("arguments imply differing number of rows: %s", paste(unique(nrows), collapse = ", ")), domain = NA)

I also wonder whether there is a problem with my cellchat@options$pathways.ignore, which returns NULL.

This may not have a serious effect on the results, but I would like to solve it. Thanks a lot!
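
A possible explanation for the NULL, offered as a guess rather than a confirmed diagnosis: the patch lines run inside netClustering(), where the CellChat object is bound to the parameter `object` (as in `methods::slot(object, slot.name)`). Assigning to `cellchat@options` there only changes a function-local copy, so the returned object never receives `pathways.ignore`. The toy S4 class `Toy` and the two functions below are hypothetical and only illustrate this R scoping behaviour.

```r
library(methods)

# Toy stand-in for a CellChat-like object with an `options` slot.
setClass("Toy", slots = c(options = "list"))
cellchat <- new("Toy", options = list())

# Mimics the patch: assigns to `cellchat` inside a function whose
# parameter is called `object`. Only a local copy of `cellchat` changes.
patched_wrong <- function(object) {
  cellchat@options$pathways.ignore <- c("EGF")
  object
}

# Assigns to the parameter that is actually returned.
patched_right <- function(object) {
  object@options$pathways.ignore <- c("EGF")
  object
}

wrong <- patched_wrong(cellchat)
right <- patched_right(cellchat)

print(wrong@options$pathways.ignore)  # NULL
print(right@options$pathways.ignore)  # "EGF"
```

If this is indeed the cause, writing to `object@options` instead of `cellchat@options` in the patched function should make the list available after `cellchat <- netClustering(cellchat, type = "functional")`. With a NULL `pathway.remove`, the plotting function would still see all pathways while the filtered embedding has fewer rows, which would be consistent with the "differing number of rows" error.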

xiandxing avatar Mar 10 '22 01:03 xiandxing

Same problem here. I used YitengDang's code modification to netClustering() and it worked. However, when running

netVisual_embedding(cellchat, type = "functional", label.size = 3.5, pathway.remove = cellchat@options$pathways.ignore)

I get the following error:

Error in data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), : arguments imply differing number of rows: 49, 51

I also get NULL when typing cellchat@options$pathways.ignore.

Could you please help with this? Thank you very much.

sofiapuvogelvittini avatar Apr 27 '22 14:04 sofiapuvogelvittini

@sofiapuvogelvittini I am sorry to hear that this issue has not been fully addressed. Would you be willing to share your cellchat object with me so that I can test it? Do you have the same issue when running the data in the tutorial, "./tutorial/cellchat_humanSkin_LS.rds"?

sqjin avatar Apr 27 '22 22:04 sqjin

Dear sqjin, thank you very much for offering help. I would be happy to share the cellchat object with you so you can test it. I have the same issue when running the data in the tutorial. By modifying the code as suggested by YitengDang I can run my.netClustering() without problems. However, when plotting with

netVisual_embedding(cellchat, type = "functional", label.size = 3.5, pathway.remove = cellchat@options$pathways.ignore)

I still get the error:

Error in data.frame(x = Y[, 1], y = Y[, 2], Commun.Prob. = prob_sum/max(prob_sum), : arguments imply differing number of rows: 25, 26

I also obtain NULL in cellchat@options$pathways.ignore. How can I send you the object? All the best, Sofía

sofiapuvogelvittini avatar May 10 '22 09:05 sofiapuvogelvittini

@YitengDang Hi, may I know if you have solved your problem now? I ran into the same issue.

  1. future::plan("multiprocess", workers = 4)
     Error: ‘node$session_info$process$pid == pid’ is not TRUE

When I changed the code, the error still appeared:

future::plan("multisession", workers = 4)
Error: ‘node$session_info$process$pid == pid’ is not TRUE

Interestingly, it only worked with the following call, after which identifyOverExpressedGenes ran well:

future::plan("multisession", workers = 1)

But with more than one worker, the error appeared again:

future::plan("multisession", workers = 2)
Error: ‘node$session_info$process$pid == pid’ is not TRUE
future::plan("multisession", workers = 3)
Error: ‘node$session_info$process$pid == pid’ is not TRUE
future::plan("multisession", workers = 4)
Error: ‘node$session_info$process$pid == pid’ is not TRUE

  2. For netClustering(), I changed the signature to

netClustering <- function(object, slot.name = "netP", type = c("functional","structural"), comparison = NULL, k = NULL, methods = "kmeans", do.plot = TRUE, fig.id = NULL, do.parallel = TRUE, nCores = 1, k.eigen = NULL)

but it reported:

Error in storage.mode(x) <- "double" : 'list' object cannot be coerced to type 'double'

When I modified the call to kmeans(data.frame(data.use),kRange[x],nstart=10)$cluster, it showed:

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class ‘"umap"’ to a data.frame

So, could you help me solve these problems? Thank you very much!

AIYang1210 avatar May 18 '22 17:05 AIYang1210

@AIYang1210 I suggest setting future::plan("multisession", workers = 1) when running this function. We will fix this once we find a good solution.
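
As a sketch of this workaround, the plan can be set once before the CellChat calls. The `requireNamespace` guard is only there so that the snippet also runs where the future package is missing. `future::plan("sequential")` is shown as an alternative way to avoid background workers altogether; it is not something suggested in this thread.

```r
# Run future-based code with a single worker to avoid the multi-worker error.
if (requireNamespace("future", quietly = TRUE)) {
  future::plan("multisession", workers = 1)
  n_workers <- future::nbrOfWorkers()
  print(n_workers)

  # Alternative: no background R sessions at all.
  future::plan("sequential")
} else {
  n_workers <- NA_integer_
  message("Package 'future' is not installed.")
}
```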

sqjin avatar May 29 '22 22:05 sqjin

I was facing the same type of error (NaNs in the embedding). I changed the UMAP backend package to uwot and the errors seem to be fixed. I guess this is related to the initial seeds of the umap function, so if you could somehow introduce an option to set the seed within the netEmbedding function, such errors could be fixed.
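
A minimal sketch of that idea outside CellChat, assuming the uwot package is available: a toy similarity matrix (made up here, not CellChat output) is embedded with `uwot::umap` under a fixed seed, and the result is checked for non-finite coordinates before any clustering. Recent CellChat versions appear to offer a `umap.method` argument in netEmbedding for switching the backend, but that is an assumption to verify against `?netEmbedding` for the installed version.

```r
# Toy symmetric similarity matrix for 20 hypothetical pathways.
set.seed(42)
n <- 20
S <- matrix(runif(n * n), n, n)
S <- (S + t(S)) / 2
diag(S) <- 1

if (requireNamespace("uwot", quietly = TRUE)) {
  set.seed(42)  # fix the seed before computing the layout
  emb <- uwot::umap(S, n_neighbors = 5, n_components = 2, min_dist = 0.3)
  # Count non-finite coordinates before any clustering step.
  print(sum(!is.finite(emb)))
} else {
  emb <- NULL
  message("Package 'uwot' is not installed.")
}
```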

Best, Pourya

pouryany avatar Sep 11 '22 02:09 pouryany

I get the same error, but it appears in the computeCommunProb function. Is there a definitive solution?

cellchat <- computeCommunProb(cellchat, raw.use = TRUE, type = "truncatedMean", trim = 0.1, distance.use = TRUE, interaction.length = 200, scale.distance = 0.01)
truncatedMean is used for calculating the average gene expression per cell group.
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

maqingyue avatar Mar 20 '23 08:03 maqingyue

Hi, I'm simply trying to run the tutorial script in Rstudio, but am running into the following problem.

When running the following, I get the error message:

cellchat <- netEmbedding(cellchat,slot.name = 'netP',type = "functional")
Manifold learning of the signaling networks for a single dataset
Error in runUMAP(Similarity, min_dist = min_dist, n_neighbors = n_neighbors, : Cannot find UMAP, please install through pip (e.g. pip install umap-learn or reticulate::py_install(packages = 'umap-learn')).

reticulate::py_install(packages = 'umap-learn')
Using virtual environment "C:/Users/18408/Documents/.virtualenvs/r-reticulate" ...
+ "C:/Users/18408/Documents/.virtualenvs/r-reticulate/Scripts/python.exe" -m pip install --upgrade --no-user umap-learn
Requirement already satisfied: umap-learn in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (0.5.4)
Requirement already satisfied: numpy>=1.17 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (1.25.2)
Requirement already satisfied: scipy>=1.3.1 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (1.11.3)
Requirement already satisfied: scikit-learn>=0.22 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (1.3.1)
Requirement already satisfied: numba>=0.51.2 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (0.58.0)
Requirement already satisfied: pynndescent>=0.5 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (0.5.10)
Requirement already satisfied: tqdm in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from umap-learn) (4.66.1)
Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from numba>=0.51.2->umap-learn) (0.41.0)
Requirement already satisfied: joblib>=0.11 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from pynndescent>=0.5->umap-learn) (1.3.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from scikit-learn>=0.22->umap-learn) (3.2.0)
Requirement already satisfied: colorama in c:\users\18408\documents\.virtualenvs\r-reticulate\lib\site-packages (from tqdm->umap-learn) (0.4.6)

cellchat <- netEmbedding(cellchat, type = "functional")
Manifold learning of the signaling networks for a single dataset
Error in runUMAP(Similarity, min_dist = min_dist, n_neighbors = n_neighbors, : Cannot find UMAP, please install through pip (e.g. pip install umap-learn or reticulate::py_install(packages = 'umap-learn')).

Could you please help me solve this issue? Thanks a lot in advance!
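
One thing worth checking, offered as a guess: "Cannot find UMAP" can still appear when umap-learn is installed in one Python environment (here the r-reticulate virtualenv) while the R session is bound to a different Python. The hypothetical helper below uses reticulate to select that virtualenv, then reports which interpreter is active and whether the `umap` module can be imported. It is meant to be run in a fresh R session, before anything else has started Python.

```r
# Hypothetical helper: report the Python seen by reticulate and whether
# the umap-learn module ("umap") is importable from it.
check_umap_python <- function(envname = "r-reticulate") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    stop("Package 'reticulate' is not installed.")
  }
  # Ask reticulate to use the virtualenv that py_install() wrote to.
  reticulate::use_virtualenv(envname, required = TRUE)
  list(
    python   = reticulate::py_config()$python,
    has_umap = reticulate::py_module_available("umap")
  )
}

# Usage (not run here, since it starts Python):
# str(check_umap_python())
```

If `has_umap` comes back FALSE, or `python` points somewhere other than the virtualenv, the interpreter binding is the likely culprit rather than the installation itself.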

double322 avatar Oct 12 '23 11:10 double322