Issue accessing KEGG data

Open emmanuelgonz opened this issue 2 years ago • 15 comments

I am receiving this error when running enrichKEGG():

Error in download.KEGG.Path(species) : Failed to download KEGG data. Wrong 'species' or the network is unreachable. The 'species' should be one of organisms listed in 'https://www.genome.jp/kegg/catalog/org_list.html'

The code ran just fine yesterday - I noticed the issue just this morning. I have tried running the same code on two different computing environments, and I receive the same error. Below are version numbers:

  • R 4.2.2
  • RStudio 2022.07.1 Build 554
  • clusterProfiler 4.6.2

emmanuelgonz, Mar 31 '23 21:03

I also encountered this problem.

The following error occurs when I run the script with Rscript:

Reading KEGG annotation online: "https://rest.kegg.jp/link/mmu/pathway"...
fail to download KEGG data...
Error in download.KEGG.Path(species) :
  Failed to download KEGG data. Wrong 'species' or the network is unreachable. The 'species' should be one of organisms listed in 'https://www.genome.jp/kegg/catalog/org_list.html'
Calls: enrichKEGG ... prepare_KEGG -> download_KEGG -> download.KEGG.Path
In addition: Warning message:
In utils::download.file(url, quiet = TRUE, method = method, ...) :
  URL 'https://rest.kegg.jp/link/mmu/pathway': status was 'Peer certificate cannot be authenticated with given CA certificates'

The code that triggers the problem is:

KG <- enrichKEGG(gene = DEe$ENTREZID, keyType = "kegg", organism = "mmu", pvalueCutoff = 0.05, qvalueCutoff = 1)

The contents of "DEe" are as follows:

> head(DEe)
   SYMBOL ENTREZID
1 Ankrd23    78321
2  Il18r1    16182
3  Prkag3   241113
4    Sctr   319229
5    Cd55    13136
6    Fmo2    55990

It worked three weeks ago but fails now, although it occasionally still succeeds when run on its own from the R command line.

Here is my R environment:

> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.7 LTS

Matrix products: default
BLAS:   /home/luna/Desktop/Software/R.4/R-4.2.0/lib/R/lib/libRblas.so
LAPACK: /home/luna/Desktop/Software/R.4/R-4.2.0/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] stringr_1.5.0             ggplot2_3.4.1
 [3] RColorBrewer_1.1-3        DO.db_2.9
 [5] pathview_1.38.0           Rgraphviz_2.42.0
 [7] topGO_2.50.0              SparseM_1.81
 [9] GO.db_3.16.0              graph_1.76.0
[11] org.Mm.eg.db_3.16.0       AnnotationDbi_1.60.2
[13] IRanges_2.32.0            S4Vectors_0.36.2
[15] Biobase_2.58.0            BiocGenerics_0.44.0
[17] R.utils_2.12.2            R.oo_1.25.0
[19] R.methodsS3_1.8.2         msigdbr_7.5.1
[21] clusterProfiler_4.7.1.003

loaded via a namespace (and not attached):
  [1] fgsea_1.24.0           colorspace_2.1-0       ggtree_3.6.2
  [4] gson_0.1.0             qvalue_2.30.0          XVector_0.38.0
  [7] aplot_0.1.10           farver_2.1.1           remotes_2.4.2
 [10] graphlayouts_0.8.4     ggrepel_0.9.3          bit64_4.0.5
 [13] fansi_1.0.4            scatterpie_0.1.8       codetools_0.2-19
 [16] splines_4.2.0          cachem_1.0.7           GOSemSim_2.24.0
 [19] polyclip_1.10-4        jsonlite_1.8.4         png_0.1-8
 [22] ggforce_0.4.1          compiler_4.2.0         httr_1.4.5
 [25] Matrix_1.5-3           fastmap_1.1.1          lazyeval_0.2.2
 [28] cli_3.6.1              tweenr_2.0.2           tools_4.2.0
 [31] igraph_1.4.1           gtable_0.3.3           glue_1.6.2
 [34] GenomeInfoDbData_1.2.9 reshape2_1.4.4         dplyr_1.1.1
 [37] fastmatch_1.1-3        Rcpp_1.0.10            enrichplot_1.18.3
 [40] vctrs_0.6.1            Biostrings_2.66.0      ape_5.7-1
 [43] babelgene_22.9         nlme_3.1-162           ggraph_2.1.0
 [46] lifecycle_1.0.3        XML_3.99-0.13          DOSE_3.25.0.002
 [49] org.Hs.eg.db_3.16.0    zlibbioc_1.44.0        MASS_7.3-58.3
 [52] scales_1.2.1           tidygraph_1.2.3        KEGGgraph_1.58.3
 [55] parallel_4.2.0         curl_5.0.0             memoise_2.0.1
 [58] gridExtra_2.3          downloader_0.4         ggfun_0.0.9
 [61] HDO.db_0.99.1          yulab.utils_0.0.6      stringi_1.7.12
 [64] RSQLite_2.3.0          tidytree_0.4.2         BiocParallel_1.32.6
 [67] GenomeInfoDb_1.34.9    rlang_1.1.0            pkgconfig_2.0.3
 [70] bitops_1.0-7           matrixStats_0.63.0     lattice_0.20-45
 [73] purrr_1.0.1            treeio_1.22.0          patchwork_1.1.2
 [76] cowplot_1.1.1          shadowtext_0.1.2       bit_4.0.5
 [79] tidyselect_1.2.0       plyr_1.8.8             magrittr_2.0.3
 [82] R6_2.5.1               generics_0.1.3         DBI_1.1.3
 [85] pillar_1.9.0           withr_2.5.0            KEGGREST_1.38.0
 [88] RCurl_1.98-1.12        tibble_3.2.1           crayon_1.5.2
 [91] utf8_1.2.3             viridis_0.6.2          data.table_1.14.8
 [94] blob_1.2.4             digest_0.6.31          tidyr_1.3.0
 [97] gridGraphics_0.5-1     munsell_0.5.0          viridisLite_0.4.1
[100] ggplotify_0.1.0


dulunar, Apr 01 '23 07:04

It is a network problem. Please try again when your network connection is good.

huerqiang, Apr 01 '23 07:04

I tried the approach described here, but it did not solve the problem: https://zhuanlan.zhihu.com/p/534214175

Code:

library(R.utils)
getOption("clusterProfiler.download.method")
R.utils::setOption("clusterProfiler.download.method", "auto")

Error:

In addition: Warning message:
In utils::download.file(url, quiet = TRUE, method = method, ...) :
  URL 'https://rest.kegg.jp/link/osa/pathway': status was 'SSL peer certificate or SSH remote key was not OK'
Execution halted

The status "SSL peer certificate or SSH remote key was not OK" suggests that the problem may be the KEGG server's certificate.

Changing the cacheOK = TRUE argument of utils::download.file() to FALSE might work, but I do not know how to change this argument from within clusterProfiler. Thanks for your attention.

A method obtained from someone else, as a temporary workaround for an expired SSL certificate:

# Unix-like operating systems
library(clusterProfiler)
options(clusterProfiler.download.method = 'curl')
options(download.file.extra = '-k')
# "wget" might work in the same way.
options(clusterProfiler.download.method = 'wget')
options(download.file.extra = '--no-check-certificate')
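Note that setting these options with options() leaves certificate verification disabled for every later download in the session. A minimal sketch of limiting the workaround to a single call, assuming the two option names quoted above; the helper name with_insecure_download is hypothetical and not part of clusterProfiler:

```r
# Hypothetical helper (not part of clusterProfiler): evaluate `expr` with the
# external curl binary told to skip certificate verification, then restore
# whatever option values were in place before.
with_insecure_download <- function(expr) {
  old <- options(
    clusterProfiler.download.method = "curl",
    download.file.extra = "-k"  # curl flag: do not verify the certificate
  )
  on.exit(options(old), add = TRUE)  # runs even if `expr` fails
  force(expr)
}

# Usage (assumes clusterProfiler is loaded and `genes` holds Entrez IDs):
# kk <- with_insecure_download(enrichKEGG(gene = genes, organism = "mmu"))
```

Skipping verification is only reasonable as a stopgap while the server certificate is known to be expired; once it is renewed, the default settings should work again.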

liuyinzhe, Apr 01 '23 10:04

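Since you have already seen that tutorial, you can use options(clusterProfiler.download.method = xx) to try the other download methods. This is not the first time KEGG downloads have failed. We tested the download speed of the different methods, and the default is the one that performed best at the time. As KEGG is updated, another method may become the best one, which is why we provide options(clusterProfiler.download.method = xx) so that users can choose. When the network connection is good, all methods work. I suggest using createKEGGdb to build a local copy of the data.

huerqiang, Apr 01 '23 10:04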
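One way to find out which value of clusterProfiler.download.method works on a given machine is to try each method directly with utils::download.file(). A rough sketch, assuming the method names that download.file() documents; probe_download_methods is a hypothetical helper, not a clusterProfiler function:

```r
# Try each download method against one URL and report which ones succeed.
# "wininet" exists only on Windows, so it is dropped on other platforms.
probe_download_methods <- function(url,
                                   methods = c("auto", "libcurl", "curl",
                                               "wget", "wininet")) {
  if (.Platform$OS.type != "windows") methods <- setdiff(methods, "wininet")
  vapply(methods, function(m) {
    dest <- tempfile()
    on.exit(unlink(dest), add = TRUE)
    ok <- tryCatch(
      # download.file() returns 0 (invisibly) on success
      suppressWarnings(
        utils::download.file(url, dest, method = m, quiet = TRUE)
      ) == 0,
      error = function(e) FALSE
    )
    isTRUE(ok)
  }, logical(1))
}

# Usage:
# res <- probe_download_methods("https://rest.kegg.jp/link/hsa/pathway")
# if (any(res)) options(clusterProfiler.download.method = names(res)[res][1])
```

If every method fails with a certificate error, the cause is more likely on the server side than in the choice of method.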

I am also getting this error when running enrichKEGG(). Before running enrichKEGG(), I first ran:

R.utils::setOption("clusterProfiler.download.method", 'auto')

The error message is as follows:

Error in download.KEGG.Path(species) : Failed to download KEGG data. Wrong 'species' or the network is unreachable. The 'species' should be one of organisms listed in 'https://www.genome.jp/kegg/catalog/org_list.html'

Versions:

  • R 4.2.3
  • RStudio 2022.07.1 Build 554
  • clusterProfiler 4.7.1 (from GitHub)

I also tried

options(clusterProfiler.download.method = xx)

but got the following messages:

Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
fail to download KEGG data...
Error in download.KEGG.Path(species) :
  Failed to download KEGG data. Wrong 'species' or the network is unreachable. The 'species' should be one of organisms listed in 'https://www.genome.jp/kegg/catalog/org_list.html'
In addition: Warning messages:
1: In utils::download.file(url, quiet = TRUE, method = method, ...) :
  the 'wininet' method is deprecated for http:// and https:// URLs
2: In utils::download.file(url, quiet = TRUE, method = method, ...) :
  InternetOpenUrl failed: 'The date in the certificate is invalid or has expired'

In addition, I also tried createKEGGdb:

library(createKEGGdb)
species <- c("rno", "hsa", "mmu")
create_kegg_db(species)

but got the following messages:

Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/rno"...
fail to download KEGG data...
Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/hsa"...
fail to download KEGG data...
Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/mmu"...
fail to download KEGG data...
Error in `colnames<-`(`*tmp*`, value = c("path_id", "path_name")) :
  attempt to set 'colnames' on an object with less than two dimensions

muchangqing777, Apr 01 '23 10:04

@muchangqing777 It may be a new bug, we will fix it soon.

huerqiang, Apr 01 '23 11:04

Same problem here.

lucygarner, Apr 01 '23 12:04

I don't think this is related to clusterProfiler; it looks KEGG-related. The cause appears to be the expired SSL certificate of rest.kegg.jp:

image

When I visit https://rest.kegg.jp/link/ko/pathway, I get:

image

Hopefully this will be resolved soon.
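To check from R whether the failure comes from the connection itself rather than from clusterProfiler, the endpoint can be opened directly with base R. A minimal sketch; check_endpoint is a hypothetical helper, and the default URL is simply one of the addresses quoted in this thread:

```r
# Try to read one line from a URL using base R connections only.
# On failure, `detail` carries the first warning or error message, which
# typically includes the SSL/certificate status reported by the HTTP backend.
check_endpoint <- function(endpoint = "https://rest.kegg.jp/link/hsa/pathway") {
  con <- url(endpoint)               # creates the connection without opening it
  on.exit(close(con), add = TRUE)
  tryCatch(
    list(ok = TRUE, detail = readLines(con, n = 1, warn = FALSE)),
    warning = function(w) list(ok = FALSE, detail = conditionMessage(w)),
    error   = function(e) list(ok = FALSE, detail = conditionMessage(e))
  )
}

# Usage:
# str(check_endpoint())
```

If this fails with a certificate-related message while other HTTPS sites open normally, the problem is most likely the server certificate and not the local R installation.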

alimayy, Apr 02 '23 00:04

@muchangqing777 The error of createKEGGdb should be fixed now, please run again. If you still have issues, please feel free to contact me.
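For reference, a sketch of how a locally built database might then be used, so that enrichKEGG() no longer needs the REST API at analysis time. This follows the workflow described in the createKEGGdb README as far as I recall it; the tarball name KEGG.db_1.0.tar.gz and the wrapper build_local_kegg are assumptions and should be checked against your installed version:

```r
# Sketch of an offline workflow. Requires the createKEGGdb package and a
# working connection to KEGG while the database is being built.
build_local_kegg <- function(species, tarball = "KEGG.db_1.0.tar.gz") {
  createKEGGdb::create_kegg_db(species)  # same call as used in this thread
  # Assumption: the source package is written to the working directory
  # under this file name.
  stopifnot(file.exists(tarball))
  utils::install.packages(tarball, repos = NULL, type = "source")
}

# Usage (assumes `genes` holds Entrez IDs):
# build_local_kegg(c("hsa", "mmu", "rno"))
# library(KEGG.db)
# kk <- clusterProfiler::enrichKEGG(gene = genes, organism = "mmu",
#                                   use_internal_data = TRUE)
```

The local copy is a snapshot, so it has to be rebuilt to pick up later KEGG updates.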

huerqiang, Apr 03 '23 18:04

I am receiving this error when running enrichKEGG(). Before running enrichKEGG(), I first ran:

R.utils::setOption("clusterProfiler.download.method", 'auto')

The messages are as follows:

Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
Error in file(con, "r") :
  cannot open the connection to 'https://rest.kegg.jp/link/hsa/pathway'
In addition: Warning message:
In file(con, "r") :
  URL 'https://rest.kegg.jp/link/hsa/pathway': status was 'SSL connect error'

I also tried options(clusterProfiler.download.method = xx) with xx = auto and other values, but it does not work.

In addition, I tried createKEGGdb:

library(createKEGGdb)
species <- c("hsa")
create_kegg_db(species)

The messages are as follows:

Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/rno"...
Error in file(con, "r") :
  cannot open the connection to 'https://rest.kegg.jp/list/pathway/rno'
In addition: Warning message:
In file(con, "r") :
  URL 'https://rest.kegg.jp/list/pathway/rno': status was 'SSL connect error'

sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.utf8
[2] LC_CTYPE=Chinese (Simplified)_China.utf8
[3] LC_MONETARY=Chinese (Simplified)_China.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.utf8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] createKEGGdb_0.0.3     httr_1.4.7
[3] clusterProfiler_4.10.0 curl_5.2.0

loaded via a namespace (and not attached):
[1] IRanges_2.36.0 nnet_7.3-19
[3] goftest_1.2-3 Biostrings_2.70.2
[5] vctrs_0.6.5 spatstat.random_3.2-2
[7] digest_0.6.34 png_0.1-8
[9] shape_1.4.6 ggrepel_0.9.5
[11] deldir_2.0-2 parallelly_1.36.0
[13] MASS_7.3-60.0.1 reshape2_1.4.4
[15] httpuv_1.6.14 foreach_1.5.2
[17] BiocGenerics_0.48.1 qvalue_2.34.0
[19] withr_3.0.0 ggfun_0.1.4
[21] ellipsis_0.3.2 survival_3.5-7
[23] memoise_2.0.1 gson_0.1.0
[25] tidyHeatmap_1.8.1 tidytree_0.4.6
[27] zoo_1.8-12 GlobalOptions_0.1.2
[29] DNAcopy_1.76.0 pbapply_1.7-2
[31] KEGGREST_1.42.0 promises_1.2.1
[33] globals_0.16.2 fitdistrplus_1.1-11
[35] rstudioapi_0.15.0 pan_1.9
[37] miniUI_0.1.1.1 generics_0.1.3
[39] DOSE_3.28.2 S4Vectors_0.40.2
[41] zlibbioc_1.48.0 ggraph_2.1.0
[43] polyclip_1.10-6 GenomeInfoDbData_1.2.11
[45] SparseArray_1.2.3 interactiveDisplayBase_1.40.0
[47] xtable_1.8-4 stringr_1.5.1
[49] doParallel_1.0.17 S4Arrays_1.2.0
[51] BiocFileCache_2.10.1 hms_1.1.3
[53] glmnet_4.1-8 GenomicRanges_1.54.1
[55] irlba_2.3.5.1 colorspace_2.1-0
[57] filelock_1.0.3 ROCR_1.0-11
[59] reticulate_1.35.0 spatstat.data_3.0-4
[61] magrittr_2.0.3 lmtest_0.9-40
[63] readr_2.1.5 later_1.3.2
[65] viridis_0.6.5 ggtree_3.10.0
[67] lattice_0.22-5 spatstat.geom_3.2-8
[69] future.apply_1.11.1 scattermore_1.2
[71] XML_3.99-0.16.1 shadowtext_0.1.3
[73] cowplot_1.1.3 matrixStats_1.2.0
[75] RcppAnnoy_0.0.22 pillar_1.9.0
[77] nlme_3.1-164 iterators_1.0.14
[79] compiler_4.3.2 RSpectra_0.16-1
[81] stringi_1.8.3 UCSCXenaTools_1.4.8
[83] jomo_2.7-6 tensor_1.5
[85] minqa_1.2.6 SummarizedExperiment_1.32.0
[87] dendextend_1.17.1 lubridate_1.9.3
[89] KEGG.db_2.4.5 plyr_1.8.9
[91] crayon_1.5.2 abind_1.4-5
[93] gridGraphics_0.5-1 locfit_1.5-9.8
[95] sp_2.1-3 graphlayouts_1.1.0
[97] bit_4.0.5 dplyr_1.1.4
[99] fastmatch_1.1-4 codetools_0.2-19
[101] GetoptLong_1.0.5 plotly_4.10.4
[103] mime_0.12 splines_4.3.2
[105] circlize_0.4.15 Rcpp_1.0.12
[107] fastDummies_1.7.3 dbplyr_2.4.0
[109] HDO.db_0.99.1 blob_1.2.4
[111] utf8_1.2.4 clue_0.3-65
[113] BiocVersion_3.18.1 lme4_1.1-35.1
[115] fs_1.6.3 listenv_0.9.1
[117] ggplotify_0.1.2 tibble_3.2.1
[119] maftools_2.18.0 Matrix_1.6-5
[121] statmod_1.5.0 tzdb_0.4.0
[123] tweenr_2.0.2 pkgconfig_2.0.3
[125] tools_4.3.2 cachem_1.0.8
[127] RSQLite_2.3.5 viridisLite_0.4.2
[129] DBI_1.2.1 fastmap_1.1.1
[131] scales_1.3.0 grid_4.3.2
[133] ica_1.0-3 Seurat_5.0.1
[135] broom_1.0.5 AnnotationHub_3.10.0
[137] patchwork_1.2.0 BiocManager_1.30.22
[139] dotCall64_1.1-1 RANN_2.6.1
[141] rpart_4.1.23 snow_0.4-4
[143] farver_2.1.1 tidygraph_1.3.1
[145] scatterpie_0.2.1 yaml_2.3.8
[147] MatrixGenerics_1.14.0 cli_3.6.2
[149] purrr_1.0.2 stats4_4.3.2
[151] GEOquery_2.70.0 leiden_0.4.3.1
[153] lifecycle_1.0.4 uwot_0.1.16
[155] Biobase_2.62.0 backports_1.4.1
[157] BiocParallel_1.36.0 timechange_0.3.0
[159] gtable_0.3.4 rjson_0.2.21
[161] ggridges_0.5.6 progressr_0.14.0
[163] parallel_4.3.2 ape_5.7-1
[165] limma_3.58.1 jsonlite_1.8.8
[167] RcppHNSW_0.5.0 mitml_0.4-5
[169] bitops_1.0-7 ggplot2_3.4.4
[171] bit64_4.0.5 Rtsne_0.17
[173] yulab.utils_0.1.4 spatstat.utils_3.0-4
[175] SeuratObject_5.0.1 mice_3.16.0
[177] GOSemSim_2.28.1 lazyeval_0.2.2
[179] shiny_1.8.0 htmltools_0.5.7
[181] enrichplot_1.22.0 GO.db_3.18.0
[183] sctransform_0.4.1 rappdirs_0.3.3
[185] glue_1.7.0 spam_2.10-0
[187] XVector_0.42.0 RCurl_1.98-1.14
[189] treeio_1.26.0 gridExtra_2.3
[191] boot_1.3-28.1 igraph_2.0.1.1
[193] R6_2.5.1 tidyr_1.3.1
[195] DESeq2_1.42.0 SingleCellExperiment_1.24.0
[197] cluster_2.1.6 aplot_0.2.2
[199] GenomeInfoDb_1.38.5 nloptr_2.0.3
[201] DelayedArray_0.28.0 tidyselect_1.2.0
[203] ggforce_0.4.1 xml2_1.3.6
[205] AnnotationDbi_1.64.1 future_1.33.1
[207] munsell_0.5.0 KernSmooth_2.23-22
[209] data.table_1.15.0 htmlwidgets_1.6.4
[211] fgsea_1.28.0 ComplexHeatmap_2.18.0
[213] RColorBrewer_1.1-3 rlang_1.1.3
[215] spatstat.sparse_3.0-3 spatstat.explore_3.2-5
[217] fansi_1.0.6

PanSX-Dr, Feb 01 '24 13:02

@PanSX-Dr I see that you also raised this issue at https://github.com/YuLab-SMU/biomedical-knowledge-mining-book/issues/27. Your current R package versions are all fine, so the cause is most likely still your network environment.

huerqiang, Feb 01 '24 23:02