Compatibility of createKEGGdb with keyType option of clusterProfiler::enrichKEGG function
Hello,
Thanks for this useful package!
I have some questions on what exactly is stored in the resulting KEGG.db, and how that relates to the options of clusterProfiler::enrichKEGG.
enrichKEGG has an option keyType, which accepts kegg, ncbi-geneid, ncbi-proteinid or uniprot.
Background/context
I would like a solution for KEGG enrichment analysis that starts from gene SYMBOL. I want to be able to use the same solution for any arbitrary species.
From this reply (https://github.com/YuLab-SMU/clusterProfiler/issues/108#issuecomment-336784558):
"KEGG id and ENTREZID are the same for only some of the species, but not always the same."
and this blog post (https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/):
"A rule of thumb for the ‘kegg’ ID is entrezgene ID for eukaryote species and Locus ID for prokaryotes."
I conclude that kegg IDs are not reliable enough, or not sufficiently well described, for my use. I would thus prefer to use ncbi-geneid.
However, when opening the sqlite database created through createKEGGdb, I only see a field gene_or_orf_id in table pathway2gene.
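For anyone who wants to reproduce this check, here is a minimal sketch (Python, standard library only) of how the table's columns and a few of its IDs can be listed. It runs against a small in-memory stand-in rather than the real KEGG.db file: the pathway_id column name and the sample rows are assumptions made up for illustration, and only pathway2gene and gene_or_orf_id come from what I actually saw.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQLite identifier (table or column name)."""
    return '"' + name.replace('"', '""') + '"'


def table_columns(conn, table):
    """Return the column names of `table` via SQLite's table_info pragma."""
    # PRAGMA statements do not accept bound parameters, so the name is quoted.
    rows = conn.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
    return [row[1] for row in rows]  # index 1 of each row is the column name


def sample_ids(conn, table, column, limit=5):
    """Return up to `limit` distinct values of `column`, sorted."""
    col, tab = quote_ident(column), quote_ident(table)
    query = f"SELECT DISTINCT {col} FROM {tab} ORDER BY {col} LIMIT ?"
    return [row[0] for row in conn.execute(query, (limit,))]


# Stand-in database. ASSUMPTION: the second column name (pathway_id) and
# all rows below are hypothetical placeholders, not real KEGG.db content.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pathway2gene (pathway_id TEXT, gene_or_orf_id TEXT)")
conn.executemany(
    "INSERT INTO pathway2gene VALUES (?, ?)",
    [("xxx00010", "GENE_A"), ("xxx00010", "GENE_B"), ("xxx00020", "GENE_A")],
)

columns = table_columns(conn, "pathway2gene")
ids = sample_ids(conn, "pathway2gene", "gene_or_orf_id")
print(columns)
print(ids)
```

To look at a real database, replace ":memory:" with the path to the sqlite file and drop the CREATE and INSERT statements. Looking at a handful of gene_or_orf_id values this way should show whether they resemble kegg IDs or NCBI gene IDs for a given species.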
Questions:
- What is the gene_or_orf_id present in the KEGG.db database? Is it a kegg ID?
- Can I use createKEGGdb to create a KEGG.db package, and then use it for clusterProfiler::enrichKEGG with keyType = ncbi-geneid (and use_internal_data = TRUE)?
Thank you in advance for your help.
All the best