Look into PRS components of each individual sample in PLINK --score?
Is there any possible way to check which SNPs is included in counting the score of the scoring profile? any flag to put in the PLINK command line?
have you found any way to check which SNps is included in the scoring profile? I used the tutorial to calculate the PRS, besides, I also select the SNPs first, and then use --score directly, I found the results are different, I would like to know the reason for this.
Use —print-snp
Sam
On Sat, Jan 25, 2025 at 1:21 PM Yu_Zhang @.***> wrote:
have you found any way to check which SNps is included in the scoring profile? I used the tutorial to calculate the PRS, besides, I also select the SNPs first, and then use --score directly, I found the results are different, I would like to know the reason for this.
— Reply to this email directly, view it on GitHub https://github.com/choishingwan/PRS-Tutorial/issues/48#issuecomment-2614055106, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTRYVLOUT2YNJDT72RVJL2MPI2VAVCNFSM6AAAAABV3SSIF6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMJUGA2TKMJQGY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Use —print-snp
Sam …
Thanks Sam, do you mean use --print-snp in PLINK, there is no such flag in PLINK as followed. Thanks
plink --bfile EUR.final_snps_for_score \
> --score base.QC.tsv 3 4 8 header sum \
> --print-snp
PLINK v1.90b7.2 64-bit (11 Dec 2023) www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to plink.log.
Options in effect:
--bfile EUR.final_snps_for_score
--print-snp
--score base.QC.tsv 3 4 8 header sum
Error: Unrecognized flag ('--print-snp').
For more information, try "plink --help <flag name>" or "plink --help | more".
Oh, that’s for PRSice. Plink uses all SNPs, if it doesn’t, it should output them in the log
Sam
On Sun, Jan 26, 2025 at 12:09 PM Yu_Zhang @.***> wrote:
Use —print-snp
Sam … <#m_-4526959832429802359_>
Thanks Sam, do you mean use --print-snp in PLINK, there is no such flag in PLINK
` plink --bfile EUR.final_snps_for_score \
--score base.QC.tsv 3 4 8 header sum --print-snp PLINK v1.90b7.2 64-bit (11 Dec 2023) www.cog-genomics.org/plink/1.9/ (C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to plink.log. Options in effect: --bfile EUR.final_snps_for_score --print-snp --score base.QC.tsv 3 4 8 header sum
Error: Unrecognized flag ('--print-snp'). For more information, try "plink --help " or "plink --help | more". `
— Reply to this email directly, view it on GitHub https://github.com/choishingwan/PRS-Tutorial/issues/48#issuecomment-2614508857, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTRYTUQH3U2KLNFP45KB32MUJCXAVCNFSM6AAAAABV3SSIF6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMJUGUYDQOBVG4 . You are receiving this because you commented.Message ID: @.***>
it is weird that I cannot find the SNPs that is selected for the PRS in the code from tutorial (see the code in the first way below). I also tried to extract those SNPs first (as the code in the 2nd way) and then use --score in PLINK. It has different results as the following code. I have no idea the reason for the difference. I would appreciate it so much if you could help me out.
the first way is using the similar code as in the tutorial
yzh10@login2:/N/slate/yzh10/numom2b/GDM> gunzip -c base.QC.gz | awk '{print $3,$8}' > SNP.pvalue
yzh10@login2:/N/slate/yzh10/numom2b/GDM> echo "0.000001 0 0.000001" > range_list
yzh10@login2:/N/slate/yzh10/numom2b/GDM> gunzip -c base.QC.gz > base.QC.tsv
yzh10@login2:/N/slate/yzh10/numom2b/GDM> plink --bfile EUR.QC \
> --score base.QC.tsv 3 4 8 header \
> --write-snplist \
> --q-score-range range_list SNP.pvalue \
> --extract EUR.valid.snp \
> --out EUR.PRS
PLINK v1.90b7.2 64-bit (11 Dec 2023) www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to EUR.PRS.log.
Options in effect:
--bfile EUR.QC
--extract EUR.valid.snp
--out EUR.PRS
--q-score-range range_list SNP.pvalue
--score base.QC.tsv 3 4 8 header
--write-snplist
257366 MB RAM detected; reserving 128683 MB for main workspace.
5419600 variants loaded from .bim file.
5830 people (0 males, 5830 females) loaded from .fam.
--extract: 191265 variants remaining.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 5830 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is in [0.9999995, 1).
191265 variants and 5830 people pass filters and QC.
Note: No phenotypes present.
List of variant IDs written to EUR.PRS.snplist .
Warning: 7241950 lines skipped in --score file (7241950 due to variant ID
mismatch, 0 due to allele code mismatch); see EUR.PRS.nopred for details.
--score: 191265 valid predictors loaded.
Warning: 7241951 lines skipped in --q-score-range data file.
--score: 1 range processed.
Results written to EUR.PRS.*.profile.
the results is like this
yzh10@login2:/N/slate/yzh10/numom2b/GDM> head EUR.PRS.0.000001.profile
FID IID PHENO CNT CNT2 SCORE
202946750069_R03C01 202946750069_R03C01 -9 2 1 1.485e-07
202947050020_R05C01 202947050020_R05C01 -9 2 1 1.485e-07
202947050133_R04C01 202947050133_R04C01 -9 2 1 1.485e-07
203010920076_R01C01 203010920076_R01C01 -9 2 0 0
203010920076_R02C01 203010920076_R02C01 -9 2 0 0
203010920076_R04C01 203010920076_R04C01 -9 2 1 1.485e-07
203010920076_R06C01 203010920076_R06C01 -9 2 1 1.485e-07
203010920076_R07C01 203010920076_R07C01 -9 2 2 2.97e-07
203010920090_R01C01 203010920090_R01C01 -9 2 1 1.485e-07
another one is to select SNPs first and then use PLINK --score
yzh10@login2:/N/slate/yzh10/numom2b/GDM> gunzip -c base.QC.gz | awk '$7 < 1e-6 {print $3}' > significant_snps.list
yzh10@login2:/N/slate/yzh10/numom2b/GDM> comm -12 <(sort significant_snps.list) <(sort EUR.valid.snp) > final_snps_for_score.list
yzh10@login2:/N/slate/yzh10/numom2b/GDM> plink --bfile EUR.QC \
> --score base.QC.tsv 3 4 8 header \
> --extract final_snps_for_score.list \
> --out EUR.PRS
PLINK v1.90b7.2 64-bit (11 Dec 2023) www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to EUR.PRS.log.
Options in effect:
--bfile EUR.QC
--extract final_snps_for_score.list
--out EUR.PRS
--score base.QC.tsv 3 4 8 header
257366 MB RAM detected; reserving 128683 MB for main workspace.
5419600 variants loaded from .bim file.
5830 people (0 males, 5830 females) loaded from .fam.
--extract: 47 variants remaining.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 5830 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is in [0.9999995, 1).
47 variants and 5830 people pass filters and QC.
Note: No phenotypes present.
Warning: 7433168 lines skipped in --score file (7433168 due to variant ID
mismatch, 0 due to allele code mismatch); see EUR.PRS.nopred for details.
--score: 47 valid predictors loaded.
--score: Results written to EUR.PRS.profile .
yzh10@login2:/N/slate/yzh10/numom2b/GDM> head EUR.PRS.profile
FID IID PHENO CNT CNT2 SCORE
202946750069_R03C01 202946750069_R03C01 -9 94 28 0.0111638
202947050020_R05C01 202947050020_R05C01 -9 94 19 -0.000518085
202947050133_R04C01 202947050133_R04C01 -9 94 22 -0.00160957
203010920076_R01C01 203010920076_R01C01 -9 94 23 -1.80851e-05
203010920076_R02C01 203010920076_R02C01 -9 94 17 -0.0035
203010920076_R04C01 203010920076_R04C01 -9 94 26 0.0123585
203010920076_R06C01 203010920076_R06C01 -9 94 26 0.00277128
203010920076_R07C01 203010920076_R07C01 -9 94 23 0.01115
203010920090_R01C01 203010920090_R01C01 -9 94 24 0.00867234
We can see that the results are different in two ways, but they are supposed to be the same.
Could you please provide information as to how the score is different? And what’s the plink log?
Sam
On Sun, Jan 26, 2025 at 12:33 PM Yu_Zhang @.***> wrote:
it is weird that I cannot find the SNPs that is selected for the PRS in the code from tutorial (see the code below). I also tried to extract those SNPs below the P-values threshold first and then use --score in PLINK. It has different results as the following code. I have no idea the reason for the difference. I would appreciate it so much if you could help me out.
plink
--bfile EUR.QC
--score Height.QC.Transformed 3 4 12 header
--q-score-range range_list SNP.pvalue
--extract EUR.valid.snp
--out EUR— Reply to this email directly, view it on GitHub https://github.com/choishingwan/PRS-Tutorial/issues/48#issuecomment-2614518365, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTRYSNHY55HL66GZLJQTD2MUL5NAVCNFSM6AAAAABV3SSIF6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMJUGUYTQMZWGU . You are receiving this because you commented.Message ID: @.***>
Could you please provide information as to how the score is different? And what’s the plink log?
Sam …
I have found the bug, thanks for your information.