MR Dictionary

Polygenic risk score (PRS)

Synonyms: Genetic risk score, allele score, risk score, genetic score, polygenic score

A single variable produced by aggregating information from several single nucleotide polymorphisms (SNPs) that are associated with a trait/phenotype, which can be used in prediction or causal analyses (i.e., MR). Genetic risk scores usually comprise SNPs that have been identified in genome-wide association studies (GWASs) at a pre-defined level of significance. For example, a genome-wide significant p-value threshold is often used so that the score includes the SNPs that are most robustly associated with a particular trait. The terms "polygenic risk score", "genetic risk score" and "allele score" are often used interchangeably; however, the term "polygenic risk score" is more often used to refer to a score that includes genetic information from across the whole genome (or that uses a much less stringent inclusion threshold than the genome-wide significance p-value).

In an unweighted PRS, the number of "effect" alleles each person in a study has is simply added together across SNPs. For example, if a score was composed of 5 SNPs associated with a trait and an individual had the numerical genotypes of 0, 1, 1, 2 and 0, referring to the number of trait-increasing alleles, their score would be 4. Weighted PRSs multiply the number of "effect" alleles at each SNP by the magnitude of the association between that SNP and the trait of interest before summing. Usually, this association is quantified using the beta obtained for each SNP from the GWAS of that trait. Continuing the example, if the differences in mean level of the trait per allele for each of the 5 SNPs were 0.5, 1.0, 0.5, 2.0 and 0.6, respectively, then the weighted score for the same individual would be (0 × 0.5) + (1 × 1.0) + (1 × 0.5) + (2 × 2.0) + (0 × 0.6) = 5.5. Some methods of creating weighted PRSs also account for the number of SNPs and the sum of the weights of the available SNPs. Weighted scores are also known as weighted allele scores (WAS).
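The arithmetic in the worked example can be reproduced with a short Python sketch. The function names `unweighted_score` and `weighted_score` are hypothetical and chosen only for illustration; the genotypes and per-allele betas are the ones quoted in the example, and the sketch assumes that genotypes are already coded as counts of the effect allele (0, 1 or 2).

```python
def unweighted_score(genotypes):
    """Sum the effect-allele counts (0, 1 or 2) across SNPs."""
    return sum(genotypes)


def weighted_score(genotypes, weights):
    """Sum effect-allele counts, each multiplied by its per-allele weight.

    `weights` would typically be the GWAS betas for the same SNPs,
    listed in the same order as `genotypes`.
    """
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must have the same length")
    return sum(g * w for g, w in zip(genotypes, weights))


# Values taken from the worked example in the text.
genotypes = [0, 1, 1, 2, 0]
betas = [0.5, 1.0, 0.5, 2.0, 0.6]

print(unweighted_score(genotypes))       # 4
print(weighted_score(genotypes, betas))  # 5.5
```

In practice the same calculation would be applied to every individual in a study, giving one score per person.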

Genetic risk scores or PRSs have commonly been used in one-sample MR. Compared with using each genetic variant as a separate instrumental variable (IV), combining the variants into a single IV can increase statistical power, but if some of the variants contributing to the score are invalid (i.e., they violate MR assumptions), results are likely to be biased. In addition, using a score as a single IV limits the ability to test whether each SNP may be violating any of the MR assumptions. However, such aggregate scores can provide consistent estimates of the causal effect in the presence of many weak IVs.

PRSs can also be used in a two-sample setting when summary statistics have been generated from two independent individual-level data samples (or, indeed, when summary statistics of the relationship between a PRS and the exposure/outcome are available from two independent samples). For example, if the same PRS is constructed in two independent samples of individual-level data, with the PRS-exposure association estimated in one sample and the PRS-outcome association estimated in the other, the summary statistics of these associations can be used to calculate an MR estimate of the effect of the exposure on the outcome (e.g., via a Wald ratio). However, using a PRS in this context (i.e., one variable comprising multiple SNPs) limits the utility of other methods developed for summary-level data and the ability to test for violations of the MR assumptions per SNP, given that only one IV is being used.
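As a minimal sketch of the Wald ratio mentioned above, the following Python function divides the PRS-outcome association by the PRS-exposure association. The function name `wald_ratio` and all input values are hypothetical and are not taken from any study. The standard error shown is a commonly used first-order approximation that ignores uncertainty in the PRS-exposure estimate; this is an assumption of the sketch rather than the only available approach.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Return (estimate, approximate SE) for a single-instrument Wald ratio.

    beta_exposure: PRS-exposure association (estimated in sample 1)
    beta_outcome:  PRS-outcome association (estimated in sample 2)
    se_outcome:    standard error of beta_outcome

    The SE is a first-order approximation that treats beta_exposure
    as if it were known without error (an assumption of this sketch).
    """
    if beta_exposure == 0:
        raise ValueError("PRS-exposure association must be non-zero")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


# Hypothetical summary statistics, for illustration only.
estimate, se = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(estimate, se)
```

Because the PRS enters as a single instrument, this yields one ratio estimate; per-SNP methods for summary-level data cannot be applied to it, as noted above.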
