Annovar to MAF question

Open AsanchezMD opened this issue 5 years ago • 7 comments

Good Morning,

I am relatively new to bioinformatics. I have been trying to convert ANNOVAR-annotated files to MAF, but the resulting MAF files do not seem to identify INDELs correctly.

This is my code:

T_HG1 <- "T_HG1.txt"
T_HG1.maf <- annovarToMaf(annovar = T_HG1, refBuild = 'hg38', table = 'refGene')
HG_1 <- read.maf(maf = T_HG1.maf, useAll = TRUE, verbose = TRUE, clinicalData = laml.clin)

I get the following messages:

-Validating
--Found 16 variants with no Gene Symbols
--Annotating them as 'UnknownGene' for convenience
--Non MAF specific values in Variant_Classification column: NA Unknown
--Non MAF specific values in Variant_Type column: NA MNP
-Silent variants: 278
-Summarizing
-Processing clinical data
-Finished in 0.054s elapsed (0.049s cpu)

Then, when I run

plotmafSummary(maf = HG_1, rmOutlier = TRUE, addStat = 'median')

I get the following warning:

Warning message:
In titv(maf = maf, useSyn = TRUE, plot = FALSE) : Non standard Ti/Tv class: 7TRUE

Those non-standard Ti/Tv classes are INDELs, so I think annovarToMaf has a problem identifying them as INDELs. (I know they are INDELs because the company that did the bioinformatics analysis gave me the annotated VCF files and the summary plot made with maftools, but unfortunately not the code.)

Is there something I am doing wrong while importing the files? Your feedback is greatly appreciated, thanks!

Alex

AsanchezMD avatar Feb 26 '21 16:02 AsanchezMD

Hi, thanks for reporting. Could you please post your sessionInfo and, if possible, the ref and alt alleles for the problematic INDELs?

PoisonAlien avatar Feb 27 '21 08:02 PoisonAlien

Hello, thanks for the response. I am not sure how to identify the problematic INDELs. I tried the following:

mt <- titv(maf = HG_1)
mt$fraction.contribution

and I got the following:

mt = titv(maf = HG_1)
Warning message:
In titv(maf = HG_1) : Non standard Ti/Tv class: 1TRUE

mt$fraction.contribution
   Tumor_Sample_Barcode      C>A      C>G      C>T      T>C      T>A      T>G
1:                T_HG1 9.693878 14.79592 39.28571 18.36735 6.632653 11.22449

I am very confused: with the titv function I get "Non standard Ti/Tv class: 1TRUE", but with plotmafSummary I get "Non standard Ti/Tv class: 7TRUE".

This is the session info:

R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
character(0)

other attached packages:
[1] maftools_2.6.05

loaded via a namespace (and not attached):
 [1] compiler_4.0.4 Matrix_1.3-2 graphics_4.0.4 tools_4.0.4 RColorBrewer_1.1-2
 [6] survival_3.2-7 utils_4.0.4 yaml_2.2.1 grDevices_4.0.4 stats_4.0.4
[11] datasets_4.0.4 splines_4.0.4 grid_4.0.4 data.table_1.14.0 methods_4.0.4
[16] base_4.0.4 lattice_0.20-41

AsanchezMD avatar Mar 01 '21 22:03 AsanchezMD

Hi, sorry for the delay. Would it be possible to share a reproducible example file?

PoisonAlien avatar Mar 08 '21 13:03 PoisonAlien

Hi, yes I can. How can I send you the file? Can I email it? @PoisonAlien Thank you so much!

AsanchezMD avatar Mar 08 '21 17:03 AsanchezMD

Hi, Yes, e-mail would be better. You can also hide any sensitive information that you don't want to share. [email protected]

PoisonAlien avatar Mar 08 '21 18:03 PoisonAlien

Hi, thanks for the file; it was helpful for debugging the issue.

  1. annovarToMaf has wrongly annotated some of the single-base insertions as SNP, which causes the "Non standard Ti/Tv class" warning.

Example: see the second row, where the insertion ->C is classified as Variant_Type SNP.

   Reference_Allele Tumor_Seq_Allele2 Variant_Classification Variant_Type
1:                -               CGT           In_Frame_Ins          INS
2:                -                 C        Frame_Shift_Ins          SNP
3:                -                GT        Frame_Shift_Ins          INS

They are ignored anyways so it should be fine.

  2. The titv function uses all variants (silent and non-synonymous) for analysis, whereas plotmafSummary uses only non-synonymous variants. That is why you see either 1 or 7 non-standard Ti/Tv classes.

I will fix the bug soon, but overall this should not affect your analysis. Let me know if you have any follow-up questions.
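
For illustration, rows like the one above can be spotted and relabelled by hand. The following is a minimal sketch in base R, not the fix that went into maftools. It assumes the MAF convention that "-" in Reference_Allele marks an insertion and "-" in Tumor_Seq_Allele2 marks a deletion, and it uses a toy data frame (the names maf_tbl and mislabelled are made up here) that mirrors the three rows shown in the example.

```r
# Toy table mirroring the three example rows (illustrative, not real data)
maf_tbl <- data.frame(
  Reference_Allele       = c("-", "-", "-"),
  Tumor_Seq_Allele2      = c("CGT", "C", "GT"),
  Variant_Classification = c("In_Frame_Ins", "Frame_Shift_Ins", "Frame_Shift_Ins"),
  Variant_Type           = c("INS", "SNP", "INS"),
  stringsAsFactors       = FALSE
)

# Assumed convention: "-" as reference allele means insertion,
# "-" as alternate allele means deletion
is_ins <- maf_tbl$Reference_Allele == "-"
is_del <- maf_tbl$Tumor_Seq_Allele2 == "-"

# Row indices of indels that carry the SNP label by mistake
mislabelled <- which((is_ins | is_del) & maf_tbl$Variant_Type == "SNP")
print(maf_tbl[mislabelled, ])

# Relabel Variant_Type from the alleles themselves
maf_tbl$Variant_Type[is_ins] <- "INS"
maf_tbl$Variant_Type[is_del] <- "DEL"
```

On real data, the same check could presumably be run on the table returned by annovarToMaf before it is passed to read.maf, though that is untested here.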

PoisonAlien avatar Mar 09 '21 08:03 PoisonAlien

Hi, I have the same issue. This tool is not considering any INS or DEL variants. Every mutation in my data is classified as SNP, and the plot only contains SNP data.

sinumol avatar Jul 07 '22 10:07 sinumol

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] avatar Sep 09 '23 01:09 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Sep 23 '23 01:09 github-actions[bot]