
Why does enrichKEGG() output include Category / Subcategory, but in my case gseKEGG() does not?

Open • laleoarrow opened this issue 4 months ago • 1 comment

Hi all!

First, thank you for your excellent work on this package!

I noticed a discrepancy between the two KEGG analyses (or I may have missed something):
• enrichKEGG(gene = gene, organism = "hsa", minGSSize = 10, pvalueCutoff = 0.05) returns results with Category and Subcategory columns (i.e. pathway-level classification).
• gseKEGG(geneList = geneList, organism = "hsa", minGSSize = 10, pvalueCutoff = 0.05, verbose = FALSE) returns results without those classification columns.
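
A minimal sketch for reproducing the comparison. The input data is not shown in this post, so the example `geneList` shipped with DOSE is used as a stand-in, and the `abs(geneList) > 2` cutoff for picking `gene` is an arbitrary assumption. Both KEGG calls need network access, and the columns reported may differ between clusterProfiler versions.

```r
library(clusterProfiler)

# Stand-in input (assumption): ranked fold changes named by Entrez ID, from DOSE.
data(geneList, package = "DOSE")

# Arbitrary cutoff (assumption) to derive a gene set for over-representation analysis.
gene <- names(geneList)[abs(geneList) > 2]

# Same arguments as in the two calls quoted above.
ora <- enrichKEGG(gene = gene, organism = "hsa",
                  minGSSize = 10, pvalueCutoff = 0.05)
gsea <- gseKEGG(geneList = geneList, organism = "hsa",
                minGSSize = 10, pvalueCutoff = 0.05, verbose = FALSE)

# Columns present in the enrichKEGG() table but absent from the gseKEGG() table.
only_in_ora <- setdiff(colnames(as.data.frame(ora)),
                       colnames(as.data.frame(gsea)))
print(only_in_ora)
```

Some differences are expected regardless of classification (for example, ratio and count columns are specific to over-representation analysis), so the classification columns are the ones to look for in the printed vector.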

Example output from gseKEGG@result:

               ID                                       Description setSize enrichmentScore       NES       pvalue
hsa00190 hsa00190                         Oxidative phosphorylation      96       0.6717222  3.671845 1.000000e-10
hsa03010 hsa03010                                          Ribosome      71       0.6773091  3.432822 1.000000e-10
hsa04714 hsa04714                                     Thermogenesis     153       0.5008260  2.991333 1.000000e-10
hsa05208 hsa05208 Chemical carcinogenesis - reactive oxygen species     148       0.4989005  2.926987 1.000000e-10

Could you clarify:
1. Is this difference by design (i.e. gseKEGG() currently does not integrate pathway classification)?
2. If so, what is the recommended way to attach Category / Subcategory information to the gseKEGG() results?
3. Would there be plans in future versions to include classification metadata directly in gseKEGG() outputs for consistency?

Thank you for your time and guidance. 🙏

sessionInfo()

R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.0.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pbmcapply_1.5.1        future.apply_1.20.0    future_1.67.0          vroom_1.6.5            data.table_1.17.8     
 [6] viridis_0.6.5          viridisLite_0.4.2      tidyr_1.3.1            patchwork_1.3.1        dplyr_1.1.4           
[11] Seurat_5.3.0           SeuratObject_5.1.0     sp_2.2-0               leo.basic_0.0.1        leo.sc_0.0.2          
[16] ggrepel_0.9.6          plotly_4.11.0          ggplot2_4.0.0.9000     qs_0.27.3              aPEAR_1.0             
[21] DOSE_4.2.0             org.Hs.eg.db_3.21.0    AnnotationDbi_1.70.0   IRanges_2.42.0         S4Vectors_0.46.0      
[26] Biobase_2.68.0         BiocGenerics_0.54.0    generics_0.1.4         clusterProfiler_4.16.0 devtools_2.4.5        
[31] usethis_3.1.0         

loaded via a namespace (and not attached):
  [1] igraph_2.1.4                graph_1.86.0                ica_1.0-3                   maps_3.4.3                 
  [5] tidyselect_1.2.1            bit_4.6.0                   doParallel_1.0.17           clue_0.3-66                
  [9] lattice_0.22-7              rjson_0.2.23                DoubletFinder_2.0.6         blob_1.2.4                 
 [13] stringr_1.5.1               urlchecker_1.0.1            S4Arrays_1.8.1              dichromat_2.0-0.1          
 [17] png_0.1-8                   cli_3.6.5                   ggplotify_0.1.2             goftest_1.2-3              
 [21] purrr_1.1.0                 BiocNeighbors_2.2.0         uwot_0.2.3                  curl_6.4.0                 
 [25] mime_0.13                   evaluate_1.0.4              tidytree_0.4.6              coin_1.4-3                 
 [29] ComplexHeatmap_2.24.1       stringi_1.8.7               desc_1.4.3                  lubridate_1.9.4            
 [33] httpuv_1.6.16               magrittr_2.0.3              rappdirs_0.3.3              splines_4.5.1              
 [37] mclust_6.1.1                nortest_1.0-4               prodlim_2025.04.28          RApiSerialize_0.1.4        
 [41] ggraph_2.2.1                sctransform_0.4.2           ggbeeswarm_0.7.2            sessioninfo_1.2.3          
 [45] DBI_1.2.3                   Nebulosa_1.0.1              reactome.db_1.92.0          withr_3.0.2                
 [49] class_7.3-23                rprojroot_2.1.0             enrichplot_1.28.4           lmtest_0.9-40              
 [53] ggnewscale_0.5.2            brio_1.1.5                  tidygraph_1.3.1             BiocManager_1.30.26        
 [57] htmlwidgets_1.6.4           fs_1.6.6                    SingleCellExperiment_1.30.1 labeling_0.4.3             
 [61] SparseArray_1.8.1           MatrixGenerics_1.21.0       reticulate_1.43.0           zoo_1.8-14                 
 [65] XVector_0.48.0              knitr_1.50                  UCSC.utils_1.4.0            timechange_0.3.0           
 [69] foreach_1.5.2               grid_4.5.1                  timeDate_4041.110           ggtree_3.16.3              
 [73] R.oo_1.27.1                 bayesbio_1.0.0              RSpectra_0.16-2             irlba_2.3.5.1              
 [77] tester_0.2.0                ggrastr_1.0.2               fastDummies_1.7.5           gridGraphics_0.5-1         
 [81] ellipsis_0.3.2              lazyeval_0.2.2              yaml_2.3.10                 survival_3.8-3             
 [85] scattermore_1.2             ROGUE_1.0                   crayon_1.5.3                RcppAnnoy_0.0.22           
 [89] RColorBrewer_1.1-3          progressr_0.15.1            tweenr_2.0.3                mapproj_1.2.12             
 [93] later_1.4.2                 ggridges_0.5.6              codetools_0.2-20            GlobalOptions_0.1.2        
 [97] profvis_0.4.0               KEGGREST_1.48.1             Rtsne_0.17                  ggpie_0.2.5                
[101] shape_1.4.6.1               ReactomePA_1.52.0           limma_3.64.3                pkgconfig_2.0.3            
[105] spatstat.univar_3.1-4       GenomicRanges_1.60.0        aplot_0.2.8                 spatstat.sparse_3.1-0      
[109] ape_5.8-1                   xtable_1.8-4                highr_0.11                  plyr_1.8.9                 
[113] httr_1.4.7                  tools_4.5.1                 globals_0.18.0              hardhat_1.4.1              
[117] pkgbuild_1.4.8              beeswarm_0.4.0              nlme_3.1-168                crosstalk_1.2.1            
[121] MCL_1.0                     digest_0.6.37               numDeriv_2016.8-1.1         Matrix_1.7-3               
[125] tzdb_0.5.0                  furrr_0.3.1                 farver_2.1.2                reshape2_1.4.4             
[129] Augur_1.0.3                 ks_1.15.1                   yulab.utils_0.2.0           SnowballC_0.7.1            
[133] rpart_4.1.24                glue_1.8.0                  cachem_1.1.0                polyclip_1.10-7            
[137] Biostrings_2.76.0           ggalluvial_0.12.5           mvtnorm_1.3-3               rsample_1.3.1              
[141] presto_1.0.0                parallelly_1.45.1           pkgload_1.4.0               statmod_1.5.0              
[145] RcppHNSW_0.6.0              ScaledMatrix_1.16.0         pbapply_1.7-4               miloR_2.4.1                
[149] fields_16.3.1               SummarizedExperiment_1.38.1 spam_2.11-1                 gson_0.1.0                 
[153] utf8_1.2.6                  gower_1.0.2                 gtools_3.9.5                graphlayouts_1.2.2         
[157] lsa_0.73.3                  gridExtra_2.3               shiny_1.11.1                lava_1.8.1                 
[161] GenomeInfoDbData_1.2.14     R.utils_2.13.0              pals_1.10                   arules_1.7-11              
[165] memoise_2.0.1               scales_1.4.0                R.methodsS3_1.8.2           RANN_2.6.2                 
[169] stringfish_0.17.0           spatstat.data_3.1-6         rstudioapi_0.17.1           cluster_2.1.8.1            
[173] spatstat.utils_3.1-5        fitdistrplus_1.2-4          cowplot_1.2.0               colorspace_2.1-1           
[177] rlang_1.1.6                 GenomeInfoDb_1.44.1         sparseMatrixStats_1.19.0    ipred_0.9-15               
[181] dotCall64_1.2               ggforce_0.5.0               circlize_0.4.16             ggtangle_0.0.7             
[185] xfun_0.52                   pacman_0.5.1                TH.data_1.1-3               remotes_2.5.0              
[189] recipes_1.3.1               iterators_1.0.14            matrixStats_1.5.0           modeltools_0.2-24          
[193] abind_1.4-8                 randomForest_4.7-1.2        GOSemSim_2.34.0             tibble_3.3.0               
[197] libcoin_1.0-10              treeio_1.32.0               ggsci_3.2.0                 ps_1.9.1                   
[201] promises_1.3.3              RSQLite_2.4.2               qvalue_2.40.0               sandwich_3.1-1             
[205] fgsea_1.34.2                DelayedArray_0.34.1         GO.db_3.21.0                compiler_4.5.1             
[209] beachmat_2.24.0             graphite_1.54.0             listenv_0.9.1               Rcpp_1.1.0                 
[213] parsnip_1.3.2               edgeR_4.6.3                 BiocSingular_1.24.0         tensor_1.5.1               
[217] MASS_7.3-65                 BiocParallel_1.42.1         spatstat.random_3.4-1       R6_2.6.1                   
[221] fastmap_1.2.0               multcomp_1.4-28             fastmatch_1.1-6             vipor_0.4.7                
[225] ROCR_1.0-11                 rsvd_1.0.5                  nnet_7.3-20                 gtable_0.3.6               
[229] KernSmooth_2.23-26          miniUI_0.1.2                deldir_2.0-4                htmltools_0.5.8.1          
[233] yardstick_1.3.2             RcppParallel_5.1.10         bit64_4.6.0-1               spatstat.explore_3.5-2     
[237] lifecycle_1.0.4             S7_0.2.0                    processx_3.8.6              callr_3.7.6                
[241] vctrs_0.6.5                 testthat_3.2.3              spatstat.geom_3.5-0         ggfun_0.2.0                
[245] pracma_2.4.4                pillar_1.11.0               locfit_1.5-9.12             jsonlite_2.0.0             
[249] expm_1.0-0                  GetoptLong_1.0.5 

laleoarrow • Oct 08 '25 07:10

My bad — I just found the append_kegg_category() function.
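
This thread does not show how `append_kegg_category()` is called, so the sketch below does not rely on it. It illustrates the same idea in base R: look up each pathway ID in a classification table and attach the labels. The toy `gsea_df` reuses the four rows from the output above. `kegg_lookup` and the lowercase column names `category` / `subcategory` are assumptions written by hand for illustration, not objects taken from clusterProfiler.

```r
# Toy stand-in for the gseKEGG() result table, using rows shown in the issue.
gsea_df <- data.frame(
  ID = c("hsa00190", "hsa03010", "hsa04714", "hsa05208"),
  Description = c("Oxidative phosphorylation", "Ribosome", "Thermogenesis",
                  "Chemical carcinogenesis - reactive oxygen species"),
  setSize = c(96, 71, 153, 148),
  NES = c(3.671845, 3.432822, 2.991333, 2.926987),
  stringsAsFactors = FALSE
)

# Hypothetical hand-written lookup table (assumption), following KEGG's
# top-level pathway classification. Deliberately in a different row order.
kegg_lookup <- data.frame(
  ID = c("hsa05208", "hsa04714", "hsa03010", "hsa00190"),
  category = c("Human Diseases", "Organismal Systems",
               "Genetic Information Processing", "Metabolism"),
  subcategory = c("Cancer: overview", "Environmental adaptation",
                  "Translation", "Energy metabolism"),
  stringsAsFactors = FALSE
)

# match() keeps the original row order of the results;
# IDs missing from the lookup table would simply get NA labels.
idx <- match(gsea_df$ID, kegg_lookup$ID)
annotated <- gsea_df
annotated$category <- kegg_lookup$category[idx]
annotated$subcategory <- kegg_lookup$subcategory[idx]

print(annotated[, c("ID", "category", "subcategory")])
```

With a real analysis, `gsea_df` would be `as.data.frame()` of the gseKEGG() result, and the lookup table would come from KEGG's pathway classification rather than being typed by hand.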

Meanwhile, a quick follow-up question: are there standardized “category” / “subcategory” (or hierarchical) labels for other commonly used gene set collections (e.g. Reactome, MSigDB, GO)? Many thanks!

laleoarrow • Oct 08 '25 07:10