
P-VALUE differs between analysis modes (FULL vs. PASS_ONLY)

Open lacek opened this issue 2 years ago • 0 comments

Description

Switching analysisMode between FULL and PASS_ONLY changes the reported P-VALUE (and only the P-VALUE) for certain variants. In the example below the differences are small, ranging from 0.0001 to 0.0008.

Since there is no documentation explaining how to interpret P-VALUE, I am not sure whether such differences are expected or how they affect the interpretation.

Versions

  • Exomiser: 13.2.0
  • Data: 2302 (similar differences observed in both hg19 and hg38 data)

Steps to reproduce

mkdir examples
wget -NP examples \
    https://github.com/exomiser/Exomiser/raw/13.2.0/exomiser-cli/src/main/resources/vcf/Pfeiffer.vcf.gz \
    https://github.com/exomiser/Exomiser/raw/13.2.0/exomiser-cli/src/main/resources/vcf/Pfeiffer.vcf.gz.tbi \
    https://github.com/exomiser/Exomiser/raw/13.2.0/exomiser-cli/src/main/resources/examples/preset-exome-analysis.yml
echo "hpoIds: ['HP:0001156', 'HP:0001363', 'HP:0011304', 'HP:0010055']" > examples/Pfeiffer.sample.yml
sed 's/PASS_ONLY/FULL/' examples/preset-exome-analysis.yml > examples/full-exome-analysis.yml
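
Note that the sed command rewrites every line containing PASS_ONLY, comments included. A quick check along these lines can confirm that the two analysis files differ only where expected. This is a minimal sketch; the helper name `changed_lines` is hypothetical and not part of Exomiser.

```shell
# Hypothetical helper: print only the lines that differ between two files.
# Lines starting with "<" come from the first file, ">" from the second.
changed_lines() {
    diff "$1" "$2" | grep '^[<>]'
}
```

For example, `changed_lines examples/preset-exome-analysis.yml examples/full-exome-analysis.yml` should list only lines that mention the analysis mode.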

java -jar "$EXOMISER_DIR/exomiser-cli-13.2.0.jar" \
    --assembly hg19 \
    --output-directory examples \
    --output-format VCF,TSV_GENE,TSV_VARIANT \
    --vcf examples/Pfeiffer.vcf.gz \
    --sample examples/Pfeiffer.sample.yml \
    --output-filename Pfeiffer.pass \
    --analysis examples/preset-exome-analysis.yml

java -jar "$EXOMISER_DIR/exomiser-cli-13.2.0.jar" \
    --assembly hg19 \
    --output-directory examples \
    --output-format VCF,TSV_GENE,TSV_VARIANT \
    --vcf examples/Pfeiffer.vcf.gz \
    --sample examples/Pfeiffer.sample.yml \
    --output-filename Pfeiffer.full \
    --analysis examples/full-exome-analysis.yml

After excluding the variants that appear only in the output of the FULL analysis, the results are almost identical, except that 83 variants differ in the P-VALUE field:

join -j1 \
    <(awk -F'\t' 'NR>2 {print $2"_"$3,$6}' examples/Pfeiffer.pass.variants.tsv | sort -k1,1) \
    <(awk -F'\t' 'NR>2 {print $2"_"$3,$6}' examples/Pfeiffer.full.variants.tsv | sort -k1,1) \
    | awk '$3-$2>0 {print $0,$3-$2}' | sort -k4,4nr
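
The command above reports only rows where the FULL value is larger than the PASS_ONLY value. A variant that also catches differences in the other direction could look like the sketch below. It assumes the same column layout as the command above (columns 2 and 3 form the key, column 6 is P-VALUE) and skips the first two lines by default to mirror `NR>2`; the function name `compare_pvalues` and its optional third argument are hypothetical.

```shell
# Hypothetical helper: list variants whose P-VALUE differs between two
# Exomiser variants TSV files, in either direction.
# Usage: compare_pvalues PASS_TSV FULL_TSV [LINES_TO_SKIP]
compare_pvalues() {
    awk -F'\t' -v skip="${3:-2}" '
        FNR <= skip { next }                 # skip leading lines in each file
        { key = $2 "_" $3 }                  # same key as the join above
        NR == FNR { pass[key] = $6; next }   # first file: remember P-VALUE
        (key in pass) && (pass[key] + 0) != ($6 + 0) {
            # key, PASS value, FULL value, signed difference (FULL - PASS)
            printf "%s %s %s %.4f\n", key, pass[key], $6, $6 - pass[key]
        }
    ' "$1" "$2" | sort -k4,4nr
}
```

Running `compare_pvalues examples/Pfeiffer.pass.variants.tsv examples/Pfeiffer.full.variants.tsv | wc -l` would then give the number of variants present in both outputs whose P-VALUE differs.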

Comparison results (P-VALUE under PASS_ONLY, P-VALUE under FULL, and their difference):

VARIANT PASS FULL DIFF
2-167273315-G-A_AD_SCN7A 0.0970 0.0978 0.0008
19-2415775-C-A_AD_TMPRSS9 0.0824 0.0832 0.0008
9-95263039-T-C_AD_ECM2 0.1017 0.1024 0.0007
9-114131316-T-C_AD_ECPAS 0.0882 0.0889 0.0007
7-20449239-A-G_AD_ITGB8 0.0861 0.0868 0.0007
4-110618781-C-T_AD_CASP6 0.0889 0.0896 0.0007
3-56653839-CTG-C_AD_CCDC66 0.0816 0.0823 0.0007
3-111795803-G-T_AD_TMPRSS7 0.0762 0.0769 0.0007
2-219693329-G-A_AD_PRKAG3 0.0907 0.0914 0.0007
15-30065566-G-A_AD_TJP1 0.0832 0.0839 0.0007
14-105924603-G-T_AD_MTA1 0.0905 0.0912 0.0007
6-132649738-A-G_AD_MOXD1 0.0902 0.0908 0.0006
2-242432770-C-CGAT_AD_FARP2 0.0698 0.0704 0.0006
18-40613838-A-AG_AR_RIT2 0.0856 0.0862 0.0006
15-72502009-C-A_AD_PKM 0.0904 0.0910 0.0006
13-75900394-C-T_AD_TBC1D4 0.1023 0.1029 0.0006
12-8982318-T-C_AD_A2ML1 0.0902 0.0908 0.0006
10-38301222-T-C_AD_ZNF33A 0.0819 0.0825 0.0006
10-3208567-T-TGCACGCTAGGGAAGAGAGAGG_AR_PITRM1 0.0699 0.0705 0.0006
1-249149660-GA-G_AD_ZNF692 0.0902 0.0908 0.0006
1-154901497-T-C_AD_PMVK 0.0902 0.0908 0.0006
9-94519645-G-A_AR_ROR2 0.0693 0.0698 0.0005
9-94486381-G-A_AR_ROR2 0.0693 0.0698 0.0005
3-100605051-C-A_AD_ABI3BP 0.0379 0.0384 0.0005
19-48343000-G-A_AD_CRX 0.1150 0.1155 0.0005
18-77659045-G-C_AD_KCNG2 0.0300 0.0305 0.0005
12-56629033-C-T_AD_SLC39A5 0.1966 0.1971 0.0005
12-2907914-G-A_AD_FKBP4 0.1204 0.1209 0.0005
12-11461553-T-TTTCTGGCTTTCCTGGATGAGGTGGGGGACCTTGGGACTGGTTGCCTCCTTGTGGGGGTCGTCC_AD_PRB4 0.0690 0.0695 0.0005
10-135342118-C-T_AD_CYP2E1 0.1222 0.1227 0.0005
1-235324219-G-A_AR_RBM34 0.2202 0.2207 0.0005
1-235324212-G-A_AR_RBM34 0.2202 0.2207 0.0005
9-8465666-C-G_AD_PTPRD 0.1717 0.1721 0.0004
6-35259091-G-T_AD_ZNF76 0.1928 0.1932 0.0004
6-35258481-G-T_AD_ZNF76 0.1928 0.1932 0.0004
2-209010521-G-A_AD_CRYGB 0.1838 0.1842 0.0004
18-21494777-C-A_AD_LAMA3 0.2021 0.2025 0.0004
12-50366997-T-G_AD_AQP6 0.1668 0.1672 0.0004
12-121416797-G-A_AD_HNF1A 0.0454 0.0458 0.0004
9-8486355-G-A_AR_PTPRD 0.1510 0.1513 0.0003
9-8465666-C-G_AR_PTPRD 0.1510 0.1513 0.0003
9-140611555-A-G_AD_EHMT1 0.1617 0.1620 0.0003
7-26224829-C-G_AD_NFE2L3 0.2256 0.2259 0.0003
6-152751786-GCTT-G_AD_SYNE1 0.1497 0.1500 0.0003
2-73315604-C-T_AD_RAB11FIP5 0.2246 0.2249 0.0003
19-40421270-C-T_AD_FCGBP 0.2173 0.2176 0.0003
17-6483131-T-C_AD_KIAA0753 0.8861 0.8864 0.0003
17-49072486-C-T_AD_SPAG9 0.1931 0.1934 0.0003
14-92537354-C-CCTGCTGCTGCTG_AR_ATXN3 0.1864 0.1867 0.0003
14-75472653-A-G_AD_EIF2B2 0.1266 0.1269 0.0003
13-98829175-C-G_AD_RNF113B 0.2171 0.2174 0.0003
11-77553595-G-C_AD_AAMDC 0.1581 0.1584 0.0003
11-31128494-A-G_AD_DCDC1 0.1540 0.1543 0.0003
10-3141542-G-A_AD_PFKP 0.1592 0.1595 0.0003
10-126715565-C-T_AD_CTBP2 0.2245 0.2248 0.0003
1-27686399-G-T_AD_MAP3K6 0.1756 0.1759 0.0003
14-24705004-G-T_AD_GMPR2 0.1436 0.1438 0.0002
14-23371269-GCA-G_AR_RBM23 0.0169 0.0171 0.0002
14-23371267-CA-C_AR_RBM23 0.0169 0.0171 0.0002
11-119243493-C-T_AD_USP2 0.2137 0.2139 0.0002
10-5011081-A-T_AD_AKR1C1 0.8418 0.8420 0.0002
10-29813396-C-G_AD_SVIL 0.2076 0.2078 0.0002
1-44595144-G-T_AD_KLF17 0.8840 0.8842 0.0002
1-211836799-A-G_AD_NEK2 0.1555 0.1557 0.0002
6-38256069-G-A_AD_BTBD9 0.0147 0.0148 0.0001
6-33281505-C-A_AR_TAPBP 0.8172 0.8173 0.0001
6-33281504-C-A_AR_TAPBP 0.8172 0.8173 0.0001
6-24843131-C-T_AD_RIPOR2 0.2292 0.2293 0.0001
5-172341813-G-A_AD_ERGIC1 0.2904 0.2905 0.0001
4-159091777-C-G_AD_GASK1B 0.8929 0.8930 0.0001
4-154279631-A-G_AD_MND1 0.2911 0.2912 0.0001
3-54933874-C-T_AD_CACNA2D3 0.1461 0.1462 0.0001
20-51871502-G-A_AR_TSHZ2 0.2918 0.2919 0.0001
20-51870120-A-T_AR_TSHZ2 0.2918 0.2919 0.0001
19-37619496-G-C_AD_ZNF420 0.8162 0.8163 0.0001
19-36342551-C-G_AD_NPHS1 0.1288 0.1289 0.0001
17-10551919-G-C_AR_MYH3 0.2324 0.2325 0.0001
17-10547753-G-A_AR_MYH3 0.2324 0.2325 0.0001
17-10547753-G-A_AD_MYH3 0.0197 0.0198 0.0001
16-24834847-C-A_AD_TNRC6A 0.8532 0.8533 0.0001
14-75416073-T-C_AD_PGF 0.1340 0.1341 0.0001
13-110855918-C-G_AD_COL4A1 0.0143 0.0144 0.0001
12-7280850-C-T_AD_RBP5 0.8526 0.8527 0.0001
1-145606274-C-T_AD_POLR3C 0.2476 0.2477 0.0001
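
To double-check the row count and the range of differences in a table like the one above, a small summary can be computed from the four-column output. This is only a sketch; the function name `summarize_diffs` is hypothetical, and it expects whitespace-separated rows on standard input with the difference in column 4.

```shell
# Hypothetical helper: count rows and report the smallest and largest
# absolute difference found in column 4. Non-numeric rows (such as a
# header line) are ignored.
summarize_diffs() {
    awk 'NF >= 4 && $4 + 0 == $4 {
            d = ($4 < 0) ? -$4 : $4          # absolute difference
            if (n == 0 || d < min) min = d
            if (n == 0 || d > max) max = d
            n++
        }
        END {
            if (n) printf "n=%d min=%.4f max=%.4f\n", n, min, max
            else print "n=0"
        }'
}
```

The output of the comparison command can be piped straight into it, for example `... | sort -k4,4nr | summarize_diffs`.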

lacek, Sep 18 '23 08:09