Error writing to connection - Linux/Ubuntu parallelization issue
We get an error on several brand-new Ubuntu servers, with everything installed and up to date, when we run the code with parallelization. This is the error:
Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Error in serialize(data, node$con, xdr = FALSE) : error writing to connection
Here is a reproducible example code, copied from the tutorial, which triggers the issue with our particular setup:
write.dir <- #please fill here
library(ResistanceGA)
data(resistance_surfaces)
data(samples)
sample.locales <- SpatialPoints(samples[, c(2, 3)])
r.stack <- stack(resistance_surfaces$categorical, resistance_surfaces$continuous, resistance_surfaces$feature)
GA.inputs <- GA.prep(ASCII.dir = r.stack, Results.dir = write.dir, method = "LL", max.cat = 500, max.cont = 500, seed = 555, parallel = 4)
gdist.inputs <- gdist.prep(length(sample.locales), samples = sample.locales, method = 'commuteDistance')
PARM <- c(1, 250, 75, 1, 3.5, 150, 1, 350)
Resist <- Combine_Surfaces(PARM = PARM, gdist.inputs = gdist.inputs, GA.inputs = GA.inputs, out = NULL, rescale = TRUE)
gdist.response <- Run_gdistance(gdist.inputs = gdist.inputs, r = Resist)
gdist.inputs <- gdist.prep(n.Pops = length(sample.locales), samples = sample.locales, response = as.vector(gdist.response), method = 'commuteDistance')
Multi.Surface_optim <- MS_optim(gdist.inputs = gdist.inputs, GA.inputs = GA.inputs)
Session info:
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ResistanceGA_4.1-0.46 raster_3.4-10 sp_1.4-5
loaded via a namespace (and not attached):
[1] jsonlite_1.7.2 splines_4.0.5 foreach_1.5.1
[4] gtools_3.8.2 shiny_1.6.0 expm_0.999-6
[7] stats4_4.0.5 spatstat.geom_2.1-0 LearnBayes_2.15.1
[10] pillar_1.6.1 lattice_0.20-44 glue_1.4.2
[13] digest_0.6.27 promises_1.2.0.1 polyclip_1.10-0
[16] minqa_1.2.4 colorspace_2.0-1 MuMIn_1.43.17
[19] htmltools_0.5.1.1 httpuv_1.6.1 Matrix_1.3-3
[22] plyr_1.8.6 spatstat.sparse_2.0-0 JuliaCall_0.17.4
[25] pkgconfig_2.0.3 gmodels_2.18.1 purrr_0.3.4
[28] xtable_1.8-4 spatstat.core_2.1-2 scales_1.1.1
[31] gdata_2.18.0 tensor_1.5 XR_0.7.2
[34] later_1.2.0 spatstat.utils_2.1-0 lme4_1.1-27
[37] proxy_0.4-25 tibble_3.1.2 mgcv_1.8-35
[40] generics_0.1.0 ggplot2_3.3.3 ellipsis_0.3.2
[43] XRJulia_0.9.0 cli_2.5.0 magrittr_2.0.1
[46] crayon_1.4.1 mime_0.10 deldir_0.2-10
[49] fansi_0.4.2 doParallel_1.0.16 nlme_3.1-152
[52] MASS_7.3-54 class_7.3-19 tools_4.0.5
[55] lifecycle_1.0.0 munsell_0.5.0 e1071_1.7-6
[58] gdistance_1.3-6 akima_0.6-2.1 compiler_4.0.5
[61] rlang_0.4.11 units_0.7-1 classInt_0.4-3
[64] grid_4.0.5 nloptr_1.2.2.2 iterators_1.0.13
[67] goftest_1.2-2 igraph_1.2.6 miniUI_0.1.1.1
[70] boot_1.3-28 GA_3.2.1 gtable_0.3.0
[73] codetools_0.2-18 abind_1.4-5 DBI_1.1.1
[76] R6_2.5.0 knitr_1.33 dplyr_1.0.6
[79] fastmap_1.1.0 utf8_1.2.1 ggExtra_0.9
[82] spdep_1.1-7 KernSmooth_2.23-20 spatstat.data_2.1-0
[85] parallel_4.0.5 Rcpp_1.0.6 vctrs_0.3.8
[88] sf_0.9-8 rpart_4.1-15 coda_0.19-4
[91] spData_0.3.8 tidyselect_1.1.1 xfun_0.23
We have tried reinstalling everything with different versions, to no avail. Both servers have a very large amount of RAM. A simple parallelization with doParallel works:
library(doParallel)
getPrimeNumbers <- function(n) {
  n <- as.integer(n)
  if (n > 1e6) stop("n too large")
  primes <- rep(TRUE, n)
  primes[1] <- FALSE
  last.prime <- 2L
  for (i in last.prime:floor(sqrt(n))) {
    primes[seq.int(2L * last.prime, n, last.prime)] <- FALSE
    last.prime <- last.prime + min(which(primes[(last.prime + 1):n]))
  }
  which(primes)
}
no_cores <- detectCores() - 1
registerDoParallel(cores=no_cores)
cl <- makeCluster(no_cores, type="FORK")
result <- parLapply(cl, 10:10000, getPrimeNumbers)
stopCluster(cl)
It doesn’t look like you’ve specified the full path to the directory where you want results written.
- Bill -
I did store an appropriate path in the write.dir object; I simply removed it from the example so that anyone who wants to run it can fill in their own. The script works on Windows.
Hello - I am receiving the same error, also on a server running Ubuntu.
Did you find a fix for this, @julian-wittische? Thanks. I have googled the general error, and some people seem to think it is related to the amount of memory used. However, I am running it on a server with many CPUs, so I don't think this should be a problem.
Hello,
Unfortunately, we have been unable to solve this issue. Memory is not the issue in our case, because even using only 2 CPUs with a huge amount of RAM triggers the error.
Unfortunately I do not have convenient access to an Ubuntu/Linux machine to troubleshoot this issue. Have you confirmed that you can write results to your specified directory when running a simple example?
--
-Bill-
Yes, everything writes to the correct, specified directory when I run a small example without the parallel setting.
Since I cannot seem to solve the parallel issue: is it possible to run all the rasters for SS_optim separately (i.e. on separate cores) and then concatenate the results for the pseudo-bootstrapping method? I am running it for a big area, so I am trying to find any way to speed up the process.
Or can we use doParallel around the function itself somehow? Sorry, I am very new to using parallel in R.
Thank you
Running in parallel reduces the time to optimize a single surface. It is entirely possible to optimize each surface without parallelization, but this will be extremely time consuming.
--
-Bill-
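The idea raised above, handling one surface per core instead of parallelizing within a single surface, can be sketched with the base parallel package. This is only an illustration under stated assumptions: fit_one_surface() is a hypothetical stand-in and not a ResistanceGA function, the surface names are taken from the example data only as labels, and nothing in this thread confirms that results produced this way can be combined for the bootstrap step.

```r
library(parallel)

# Hypothetical stand-in for a serial, single-surface optimization.
# It is NOT a ResistanceGA function; it only returns a small dummy summary.
fit_one_surface <- function(surface_name) {
  data.frame(surface = surface_name,
             n_chars = nchar(surface_name),
             stringsAsFactors = FALSE)
}

surface_names <- c("categorical", "continuous", "feature")

# Two socket workers; each surface is processed serially inside one worker.
cl <- makePSOCKcluster(2)
fits <- parLapply(cl, surface_names, fit_one_surface)
stopCluster(cl)

# Stack the per-surface summaries into a single table.
results <- do.call(rbind, fits)
print(results)
```

In a real run, the body of the stand-in would be replaced by a call that optimizes one surface with the parallel option switched off, so that the only parallelism is across surfaces.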
Ok thanks @wpeterman. I have found this chain (https://github.com/luca-scr/GA/issues/50 - see answer to the issue at the end) on the GA GitHub (@julian-wittische) which may help us debug the issue?
I have never used parallel in R, but once I set
cl <- makePSOCKcluster(8) # I defined cl with this command
registerDoParallel(cl)
I no longer get the previous error. However, I also do not receive the normal iteration output, so I am unsure whether it is working.
I am going to play with this today and let you know how it goes, but if you have any success with this way forward - please let me know.
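One way to check that a socket cluster is actually alive, independently of ResistanceGA, is to ask each worker for its process ID and run a trivial task on it. A minimal sketch using only the base parallel package; the worker count of 2 is an arbitrary choice for the check:

```r
library(parallel)

# Start a small socket (PSOCK) cluster; 2 workers are enough for a check.
cl <- makePSOCKcluster(2)

# Ask every worker for its process ID. PIDs that differ from the master's
# indicate that separate R processes are running and answering.
worker_pids <- unlist(clusterCall(cl, Sys.getpid))
master_pid <- Sys.getpid()

# Run a trivial task on the workers to confirm that results come back.
squares <- unlist(parLapply(cl, 1:4, function(x) x^2))

stopCluster(cl)

print(worker_pids)
print(squares)
```

If both calls return without a connection error, the cluster itself is working, and any missing iteration output is more likely a matter of what the workers print than of the workers being dead.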
I believe it now works for me. I ran the code below:
library(parallel)
library(doParallel)
cl <- makePSOCKcluster(32)
registerDoParallel(cl)
# Set variables for ResistanceGA
GA.inputs_All <- GA.prep(method="AIC", ASCII.dir=raster, Results.dir = write.dir, min.cat=1, seed=111, parallel=cl)
# Inputs for resistance method
gdist.inputs <- gdist.prep(length(sample.sp), samples=sample.sp, response= lower(fst), method='costDistance')
# Export info to cluster
clusterExport(cl=cl,varlist=c("GA.inputs_All","gdist.inputs","raster","sample.sp","fst")) # list every object used in the GA.prep and gdist.prep calls
clusterEvalQ(cl=cl, .libPaths("/R")) # set path to where your R library is
clusterCall(cl=cl, library, package = "ResistanceGA", character.only = TRUE)
# Run SS_optim
run1_SSoptim <- SS_optim(gdist.inputs = gdist.inputs, GA.inputs = GA.inputs_All, diagnostic_plots=FALSE)
# Stop cluster once it has finished
stopCluster(cl)
Has this issue been officially resolved? I'm running into the same errors when I run my code on an Ubuntu EC2 instance.
Error in unserialize(socklist[[n]]) : error reading from connection
This was an idiosyncratic error that I could never recreate on the clusters or computers I had access to. If you're receiving an error when running ResistanceGA with Julia, try following the suggestion from Julian and Eve above.