MicrobiotaProcess
Updated mp_import_humann_regroup() to keep the abundance of contributing taxa
Introduced the keep.contribute.abundance argument in mp_import_humann_regroup() (https://github.com/YuLab-SMU/MicrobiotaProcess/commit/6ccd9813547b4c03bd359cb784c771d5d75728ea).
With the default, keep.contribute.abundance = FALSE, only the names of the contributing taxa are kept for each feature.
> library(MicrobiotaProcess)
MicrobiotaProcess v1.13.2.992 For help:
https://github.com/YuLab-SMU/MicrobiotaProcess/issues
If you use MicrobiotaProcess in published research, please cite the
paper:
Shuangbin Xu, Li Zhan, Wenli Tang, Qianwen Wang, Zehan Dai, Lang Zhou,
Tingze Feng, Meijun Chen, Tianzhi Wu, Erqiang Hu, Guangchuang Yu.
MicrobiotaProcess: A comprehensive R package for deep mining
microbiome. The Innovation. 2023, 4(2):100388. doi:
10.1016/j.xinn.2023.100388
Export the citation to BibTex by citation('MicrobiotaProcess')
This message can be suppressed by:
suppressPackageStartupMessages(library(MicrobiotaProcess))
Attaching package: ‘MicrobiotaProcess’
The following object is masked from ‘package:stats’:
filter
> mpse.ko1 <- mp_import_humann_regroup('./QJ.humann3_ko.tsv', './SRP190865_meta.csv')
> mpse.ko1
# A MPSE-tibble (MPSE object) abstraction: 498,387 × 6
# OTU=5359 | Samples=93 | Assays=Abundance | Taxonomy=NULL
OTU Sample Abundance geo_loc_name_country Group contribute.taxa
<chr> <chr> <dbl> <chr> <chr> <list>
1 K00001 SRR8849198 0 China PCOS <tibble [8 × 1]>
2 K00002 SRR8849198 0 China PCOS <tibble [3 × 1]>
3 K00003 SRR8849198 55.1 China PCOS <tibble [29 × 1]>
4 K00004 SRR8849198 0 China PCOS <tibble [3 × 1]>
5 K00005 SRR8849198 83.0 China PCOS <tibble [24 × 1]>
6 K00007 SRR8849198 0 China PCOS <tibble [1 × 1]>
7 K00008 SRR8849198 0 China PCOS <tibble [6 × 1]>
8 K00009 SRR8849198 39.4 China PCOS <tibble [23 × 1]>
9 K00010 SRR8849198 0 China PCOS <tibble [16 × 1]>
10 K00012 SRR8849198 1878. China PCOS <tibble [27 × 1]>
# ℹ 498,377 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse.ko1 %>% mp_extract_feature() %>% tidyr::unnest(contribute.taxa)
# A tibble: 85,919 × 2
OTU contribute.taxa
<chr> <chr>
1 K00001 s__Bifidobacterium_bifidum
2 K00001 s__Bifidobacterium_longum
3 K00001 s__Eggerthella_lenta
4 K00001 s__Enterobacter_cloacae_complex
5 K00001 s__Klebsiella_pneumoniae
6 K00001 s__Lactobacillus_gasseri
7 K00001 s__Lactobacillus_paragasseri
8 K00001 s__Megasphaera_elsdenii
9 K00002 s__Blautia_obeum
10 K00002 s__Blautia_producta
# ℹ 85,909 more rows
# ℹ Use `print(n = ...)` to see more rows
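The nested layout shown above can be reproduced on a small scale. The following is a minimal sketch, assuming a hand-built toy table (the object names features, long and n_taxa are illustrative and not part of MicrobiotaProcess) that mimics what mp_extract_feature() returns in the default mode: one row per feature, with the contributing taxa stored as a one-column nested tibble.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Toy stand-in (assumption) for the feature table: each KO carries a
# nested one-column tibble listing its contributing taxa.
features <- tibble(
  OTU = c("K00001", "K00002"),
  contribute.taxa = list(
    tibble(contribute.taxa = c("s__Bifidobacterium_bifidum",
                               "s__Eggerthella_lenta")),
    tibble(contribute.taxa = "s__Blautia_obeum")
  )
)

# Flatten the list-column: one row per (KO, contributing taxon) pair.
long <- features %>% unnest(contribute.taxa)

# Count how many taxa contribute to each KO.
n_taxa <- long %>% count(OTU, name = "n_taxa")
```

The same unnest() call should apply to the real feature table, since the list-column there has the same shape.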
With keep.contribute.abundance = TRUE, the abundance of each contributing taxon in each sample is also kept, and it can be extracted with mp_extract_feature().
> mpse.ko2 <- mp_import_humann_regroup('./QJ.humann3_ko.tsv', './SRP190865_meta.csv', keep.contribute.abundance = TRUE)
> mpse.ko2
# A MPSE-tibble (MPSE object) abstraction: 498,387 × 6
# OTU=5359 | Samples=93 | Assays=Abundance | Taxonomy=NULL
OTU Sample Abundance geo_loc_name_country Group contribute.taxa
<chr> <chr> <dbl> <chr> <chr> <list>
1 K00001 SRR8849198 0 China PCOS <tibble [8 × 94]>
2 K00002 SRR8849198 0 China PCOS <tibble [3 × 94]>
3 K00003 SRR8849198 55.1 China PCOS <tibble [29 × 94]>
4 K00004 SRR8849198 0 China PCOS <tibble [3 × 94]>
5 K00005 SRR8849198 83.0 China PCOS <tibble [24 × 94]>
6 K00007 SRR8849198 0 China PCOS <tibble [1 × 94]>
7 K00008 SRR8849198 0 China PCOS <tibble [6 × 94]>
8 K00009 SRR8849198 39.4 China PCOS <tibble [23 × 94]>
9 K00010 SRR8849198 0 China PCOS <tibble [16 × 94]>
10 K00012 SRR8849198 1878. China PCOS <tibble [27 × 94]>
# ℹ 498,377 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse.ko2 %>% mp_extract_feature()
# A tibble: 5,359 × 2
OTU contribute.taxa
<chr> <list>
1 K00001 <tibble [8 × 94]>
2 K00002 <tibble [3 × 94]>
3 K00003 <tibble [29 × 94]>
4 K00004 <tibble [3 × 94]>
5 K00005 <tibble [24 × 94]>
6 K00007 <tibble [1 × 94]>
7 K00008 <tibble [6 × 94]>
8 K00009 <tibble [23 × 94]>
9 K00010 <tibble [16 × 94]>
10 K00012 <tibble [27 × 94]>
# ℹ 5,349 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse.ko2 %>% mp_extract_feature() %>% tidyr::unnest(contribute.taxa)
# A tibble: 85,421 × 95
OTU contribute.taxa SRR8849198 SRR8849199 SRR8849200 SRR8849201 SRR8849202
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 K00001 s__Bifidobacte… 0 0 0 3.77 0
2 K00001 s__Bifidobacte… 0 0 0 0 0
3 K00001 s__Eggerthella… 0 0 0 0 0
4 K00001 s__Enterobacte… 0 0 0 0 0
5 K00001 s__Klebsiella_… 0 0 0 0 0
6 K00001 s__Lactobacill… 0 0 0 0 0
7 K00001 s__Lactobacill… 0 0 0 0 0
8 K00001 s__Megasphaera… 0 0 0 66.0 0
9 K00002 s__Blautia_obe… 0 0 0 0 0
10 K00002 s__Blautia_pro… 0 0 0 0 0
# ℹ 85,411 more rows
# ℹ 88 more variables: SRR8849203 <dbl>, SRR8849204 <dbl>, SRR8849205 <dbl>,
# SRR8849206 <dbl>, SRR8849207 <dbl>, SRR8849208 <dbl>, SRR8849209 <dbl>,
# SRR8849210 <dbl>, SRR8849211 <dbl>, SRR8849212 <dbl>, SRR8849213 <dbl>,
# SRR8849214 <dbl>, SRR8849215 <dbl>, SRR8849216 <dbl>, SRR8849217 <dbl>,
# SRR8849218 <dbl>, SRR8849219 <dbl>, SRR8849220 <dbl>, SRR8849221 <dbl>,
# SRR8849222 <dbl>, SRR8849223 <dbl>, SRR8849224 <dbl>, SRR8849225 <dbl>, …
# ℹ Use `print(n = ...)` to see more rows
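A wide table like the one above, with one column per sample, can be reshaped to long format when per-sample contributions need to be filtered or ranked. This is a minimal sketch on toy values, assuming hypothetical sample names S1 and S2 and illustrative object names (wide, long, nonzero); it uses only tidyr and dplyr and does not depend on MicrobiotaProcess.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Toy stand-in (assumption) for the unnested feature table:
# one row per (KO, taxon), one numeric column per sample.
wide <- tibble(
  OTU = c("K00001", "K00001", "K00002"),
  contribute.taxa = c("s__Bifidobacterium_bifidum",
                      "s__Megasphaera_elsdenii",
                      "s__Blautia_obeum"),
  S1 = c(0, 0, 0),
  S2 = c(4, 60, 0)
)

# Reshape so that each row is one (KO, taxon, sample) observation.
long <- wide %>%
  pivot_longer(
    cols = -c(OTU, contribute.taxa),
    names_to = "Sample",
    values_to = "Abundance"
  )

# Keep non-zero contributions, largest first.
nonzero <- long %>%
  filter(Abundance > 0) %>%
  arrange(desc(Abundance))
```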
The gene abundance of specified taxa can be extracted quickly and converted to an MPSE object. For example, the following code extracts the gene abundance contributed by Bifidobacterium, recalculates the total abundance of each gene by summing the abundances of the contributing taxa, and generates a new MPSE object.
> mpse.ko2 %>%
+   mp_extract_feature() %>%
+   tidyr::unnest(contribute.taxa) %>%
+   dplyr::filter(grepl('s__Bifidobact', contribute.taxa)) %>%
+   dplyr::select(-contribute.taxa) %>%
+   dplyr::group_by(OTU) %>%
+   dplyr::summarize(dplyr::across(dplyr::everything(), sum)) %>%
+   tibble::column_to_rownames(var = 'OTU') %>%
+   MPSE() %>%
+   dplyr::left_join(mpse.ko2 %>% mp_extract_sample())
# A MPSE-tibble (MPSE object) abstraction: 82,398 × 5
# OTU=886 | Samples=93 | Assays=Abundance | Taxonomy=NULL
OTU Sample Abundance geo_loc_name_country Group
<chr> <chr> <dbl> <chr> <chr>
1 K00001 SRR8849198 0 China PCOS
2 K00012 SRR8849198 8.03 China PCOS
3 K00013 SRR8849198 47.4 China PCOS
4 K00016 SRR8849198 51.8 China PCOS
5 K00031 SRR8849198 0 China PCOS
6 K00052 SRR8849198 40.5 China PCOS
7 K00053 SRR8849198 146. China PCOS
8 K00057 SRR8849198 0 China PCOS
9 K00058 SRR8849198 27.3 China PCOS
10 K00059 SRR8849198 5.59 China PCOS
# ℹ 82,388 more rows
# ℹ Use `print(n = ...)` to see more rows
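The recalculation step in the pipeline above (filter by taxon, drop the taxon column, sum per gene) can be checked on a small table. This is a minimal sketch on toy values, assuming hypothetical sample names S1 and S2 and illustrative object names (contrib, bifido); it stops before the MPSE() call so that it runs with dplyr and tibble alone.

```r
library(tibble)
library(dplyr)

# Toy stand-in (assumption) for the unnested feature table.
contrib <- tibble(
  OTU = c("K00001", "K00001", "K00001", "K00012"),
  contribute.taxa = c("s__Bifidobacterium_bifidum",
                      "s__Bifidobacterium_longum",
                      "s__Megasphaera_elsdenii",
                      "s__Bifidobacterium_longum"),
  S1 = c(1, 2, 10, 4),
  S2 = c(0, 3, 20, 5)
)

# Keep only Bifidobacterium rows, then sum their abundances per KO
# and per sample to get a KO-by-sample abundance table.
bifido <- contrib %>%
  filter(grepl("s__Bifidobact", contribute.taxa)) %>%
  select(-contribute.taxa) %>%
  group_by(OTU) %>%
  summarize(across(everything(), sum)) %>%
  column_to_rownames(var = "OTU")
```

Here the Megasphaera row is excluded, so K00001 sums only the two Bifidobacterium rows. The resulting data frame has KOs as row names and samples as columns, which is the shape passed to MPSE() in the pipeline above.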