GWASs are hypothesis-free study designs in which each of a panel of hundreds of thousands or millions of genetic variants is systematically tested for association with a single trait or disease outcome. The primary objectives of GWASs are to identify specific variants that can be used to predict the trait, and to highlight genes or loci that are relevant to the aetiology of the trait or disease. Most genetic effects are small, so GWASs need to be performed using large biobanks or collaborations between many studies in which the GWAS results from all studies are meta-analysed. Control for confounding due to population structure, the use of strict significance thresholds and replication of findings in independent samples are key features of reliable GWASs. Commonly, GWAS results are visualised using a Manhattan plot, which shows the -log10(p-value) of each SNP against its position in the genome, ordered by chromosome. All variants that associate with the trait at genome-wide significance are then commonly used as genetic instruments in MR studies.
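The variant-by-variant testing described above can be sketched on simulated data. This is a minimal illustration rather than a real GWAS pipeline. The function name `gwas_scan`, the toy sample sizes, the allele frequency and the deliberately large effect of 0.4 are all assumptions made for the example. It uses the conventional 5e-8 genome-wide threshold and makes no adjustment for population structure, which a real analysis would need.

```python
import math
import numpy as np

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def gwas_scan(genotypes, trait):
    """Test each variant separately with a simple linear regression.

    genotypes: (n_samples, n_variants) array of allele counts (0, 1, 2).
    trait: (n_samples,) array of a continuous outcome.
    Returns per-variant effect estimates, standard errors and p-values.
    """
    n = trait.shape[0]
    g = genotypes - genotypes.mean(axis=0)   # centre each variant
    y = trait - trait.mean()                 # centre the trait
    sxx = (g ** 2).sum(axis=0)
    beta = g.T @ y / sxx                     # one regression slope per variant
    resid_var = ((y ** 2).sum() - beta ** 2 * sxx) / (n - 2)
    se = np.sqrt(resid_var / sxx)
    z = beta / se
    # Two-sided p-value using a normal approximation to the t distribution.
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    return beta, se, p


# Toy data (assumed values): 3,000 people, 1,000 independent variants,
# of which only the first is truly associated with the trait.
rng = np.random.default_rng(42)
n_samples, n_variants, causal_index = 3000, 1000, 0
genotypes = rng.binomial(2, 0.3, size=(n_samples, n_variants)).astype(float)
trait = 0.4 * genotypes[:, causal_index] + rng.normal(size=n_samples)

beta, se, p_values = gwas_scan(genotypes, trait)
neg_log10_p = -np.log10(p_values)            # vertical axis of a Manhattan plot
hits = np.flatnonzero(p_values < GENOME_WIDE_ALPHA)
print("genome-wide significant variants:", hits)
```

Plotting `neg_log10_p` against genomic position, grouped by chromosome, would give the Manhattan plot; the toy data here carry no positions, so that step is left out.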
Potential limitations:
- Genome-wide significance is typically based on a Bonferroni-style correction for the number of tests. Selecting variants on such a strict threshold means that small genetic effect estimates from the discovery sample tend to be biased upwards (winner’s curse). Using these (‘discovery’) effect estimates in MR, rather than estimates obtained from independent replication studies, can lead to biased effect estimates.
- The function of genetic variants identified from GWASs (known as genome-wide hits) may be unknown. Particularly for complex traits (such as BMI), where there is likely to be a chain of effects or associations from the gene to the trait, there could be a strong potential for horizontal pleiotropy.
- With sample sizes growing ever larger, the risk of subtle population stratification or dynastic effects leading to false positive or biased GWAS findings is growing.

Potential strengths when used to identify genetic IVs:
- Most GWASs will highlight genetic variants that replicate in (large) independent studies.
- Where several variants are identified as potential instruments for a trait, several statistical methods, each with differing assumptions, can be used and their results triangulated.

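The winner’s curse noted under the limitations can be illustrated with a short simulation. This is a sketch under stated assumptions. The true effect, the standard error and the number of simulated studies are invented values, and the discovery and replication samples are assumed to be independent with the same precision. The z threshold of 5.45 is the approximate two-sided cut-off corresponding to p = 5e-8.

```python
import numpy as np

rng = np.random.default_rng(1)

true_beta = 0.02     # assumed small true effect of one variant
se = 0.005           # assumed standard error in both samples
z_threshold = 5.45   # approximate two-sided z for p = 5e-8
n_sim = 200_000      # number of simulated discovery/replication pairs

# Independent estimates of the same true effect in two samples.
discovery = rng.normal(true_beta, se, n_sim)
replication = rng.normal(true_beta, se, n_sim)

# Keep only simulations in which the discovery estimate is genome-wide significant.
selected = np.abs(discovery / se) > z_threshold

mean_discovery = discovery[selected].mean()
mean_replication = replication[selected].mean()

print(f"true effect:                      {true_beta:.4f}")
print(f"mean discovery estimate (hits):   {mean_discovery:.4f}")
print(f"mean replication estimate (hits): {mean_replication:.4f}")
```

Among the simulations that pass the threshold, the discovery estimates are inflated on average, while the replication estimates for the same variants stay centred on the true effect. This is why replication estimates are preferable as inputs to MR.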
- Visscher PM, Wray NR, Zhang Q et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. American Journal of Human Genetics 2017;101:5-22.
- Haworth S, Mitchell R, Corbin L et al. Apparent latent structure within the UK Biobank sample has implications for epidemiological analysis. Nature Communications 2019;10:333.
- Bush WS, Moore JH. Chapter 11: Genome-Wide Association Studies. PLoS Computational Biology 2012;8:e1002822.