A directed acyclic graph (DAG) is a visual representation of potential causal relationships between variables. These relationships are depicted by nodes (representing variables) and arrows (a.k.a. edges or arcs) between variables. The relationship between any two variables in a DAG must be directed (i.e., there cannot be a bidirectional relationship), and the relationships between all variables must be acyclic (i.e., a variable cannot have an impact on itself through any number of other variables).
DAGs are useful tools for explicitly demonstrating the underlying assumptions of a proposed analysis. Arrows are drawn between any two variables according to the following criteria:
- An arrow from one variable to a second indicates that you assume that it is plausible that the first variable causes the second.
- Where there is no arrow between one variable and a second, this indicates that you assume that there is no causal relationship between the first and second variable.
Thus, the three key assumptions of IV analyses are illustrated by an arrow from the IV to exposure; an absence of an arrow from the IV to confounders of exposure and outcome; and an absence of an arrow from the IV directly to the outcome.
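The drawing rules above can be checked mechanically. The following is a minimal sketch, assuming the DAG is stored as a plain adjacency dictionary; the node labels (Z for the IV, X for the exposure, U for confounders, Y for the outcome) and the helper names are illustrative choices, not part of any established library.

```python
# Hypothetical encoding of the canonical IV DAG. Each key points to the
# nodes it has an arrow into: Z = IV, X = exposure, U = confounders, Y = outcome.
iv_dag = {
    "Z": ["X"],
    "U": ["X", "Y"],
    "X": ["Y"],
    "Y": [],
}


def is_acyclic(graph):
    """Return True if no variable can reach itself by following arrows."""
    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}

    def visit(node):
        state[node] = in_progress
        for child in graph[node]:
            if state[child] == in_progress:  # arrow back into the current path
                return False
            if state[child] == unvisited and not visit(child):
                return False
        state[node] = done
        return True

    return all(visit(node) for node in graph if state[node] == unvisited)


def has_arrow(graph, source, target):
    """Return True if an arrow source -> target is drawn."""
    return target in graph[source]


def iv_assumptions_drawn(graph, z, x, u, y):
    """Report whether the graph encodes the three key IV assumptions."""
    return {
        "arrow_iv_to_exposure": has_arrow(graph, z, x),
        "no_arrow_iv_to_confounders": not has_arrow(graph, z, u),
        "no_direct_arrow_iv_to_outcome": not has_arrow(graph, z, y),
    }


checks = iv_assumptions_drawn(iv_dag, "Z", "X", "U", "Y")
```

Adding a direct arrow from Z to Y to the dictionary would flip the third check to False, which corresponds to drawing a violation of the assumption that the IV does not affect the outcome directly.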
Figure 2.1 - Bidirectional Mendelian randomization. Adapted from Richmond et al.
Figure 2.3 - Multivariable Mendelian randomization. Adapted from Zheng et al. Multivariable MR uses multiple instruments (Z1, ..., Zn) associated with multiple, potentially correlated exposures (e.g., X1, X2 and X3) to jointly estimate the independent causal effect of each of the exposures on a particular outcome (Y). It can also be used to explore mediation following two-step MR.
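One common way to obtain joint estimates in multivariable MR is a weighted regression of the SNP-outcome estimates on the SNP-exposure estimates for all exposures at once, with no intercept. The caption above does not specify an estimator, so the sketch below is an assumption on that point; it is restricted to two exposures to keep the algebra explicit, and the function name and toy numbers are hypothetical.

```python
def mvmr_ivw_two_exposures(bx1, bx2, by, se_by):
    """Weighted regression of by on bx1 and bx2 with no intercept.

    bx1, bx2: SNP associations with exposures X1 and X2.
    by, se_by: SNP associations with the outcome and their standard errors.
    Weights are 1 / se_by**2. Returns (effect of X1, effect of X2).
    """
    w = [1.0 / s ** 2 for s in se_by]
    # Entries of the weighted normal equations (B'WB) theta = B'W by.
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear across instruments")
    theta1 = (s22 * t1 - s12 * t2) / det
    theta2 = (s11 * t2 - s12 * t1) / det
    return theta1, theta2


# Toy data (invented for illustration): the outcome effects are constructed
# so that X1 has an effect of 0.5 and X2 an effect of -0.2.
bx1 = [0.10, 0.20, 0.05, 0.15]
bx2 = [0.02, 0.10, 0.20, 0.05]
by = [0.5 * a - 0.2 * b for a, b in zip(bx1, bx2)]
se_by = [0.01, 0.02, 0.01, 0.015]

theta1, theta2 = mvmr_ivw_two_exposures(bx1, bx2, by, se_by)
```

Because the toy outcome effects are an exact linear combination of the exposure effects, the regression recovers the two constructed effects; with real GWAS estimates the fit would be noisy and standard errors would also be needed.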
Figure 2.4 - One-sample Mendelian randomization. (A) MR relies on the following three core assumptions: (1) the genetic variant(s) being used as an instrument (Z) is associated with the exposure (X); (2) the instrument is independent of measured and unmeasured confounders (U) of the association between the exposure (X) and outcome (Y); and (3) there is no independent pathway between the instrument (Z) and outcome (Y) other than through the exposure (X), i.e., there is no horizontal pleiotropy (otherwise known as the exclusion restriction assumption). (B) MR can be perceived as being analogous to a randomized controlled trial (RCT), whereby the random assortment of alleles at conception is equivalent to the randomization method within an RCT. This randomization process produces groups of individuals who differ with respect to the intervention (in the case of MR, genetic variation) and between which confounders are equally distributed. Therefore, any differences observed in the outcome of interest between these randomly allocated groups should be due to the exposure with which the genetic variant(s) are associated. (C) In the simplest of scenarios, the causal estimate of the association between the exposure (X) and outcome (Y) can be derived using the instrumental variable ratio method (otherwise known as the Wald ratio), 𝛽𝐼𝑉 = 𝛽𝑌𝑍 / 𝛽𝑋𝑍, where 𝛽𝐼𝑉 is the causal estimate derived from instrumental variable (i.e., MR) analyses, 𝛽𝑌𝑍 is the association between the instrument (Z) and the outcome (Y) and 𝛽𝑋𝑍 is the association between the instrument (Z) and the exposure (X).
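The Wald ratio in panel (C) is the instrument-outcome association divided by the instrument-exposure association. A minimal sketch follows; the function name and the example numbers are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the instrument-exposure estimate, which is an assumption not stated in the caption.

```python
def wald_ratio(beta_yz, beta_xz, se_yz):
    """Wald ratio estimate for a single instrument.

    beta_yz: association between the instrument (Z) and the outcome (Y).
    beta_xz: association between the instrument (Z) and the exposure (X).
    se_yz:   standard error of beta_yz.
    Returns (beta_iv, approximate standard error of beta_iv).
    """
    if beta_xz == 0:
        raise ValueError("the instrument must be associated with the exposure")
    beta_iv = beta_yz / beta_xz
    # First-order approximation: treats beta_xz as measured without error.
    se_iv = se_yz / abs(beta_xz)
    return beta_iv, se_iv


# Invented numbers for illustration only.
beta_iv, se_iv = wald_ratio(beta_yz=0.06, beta_xz=0.2, se_yz=0.015)
```

The division by 𝛽𝑋𝑍 is also why the first core assumption matters in practice: an instrument that is only weakly associated with the exposure makes the ratio unstable.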
Figure 2.5 - Two-sample Mendelian randomization. Adapted from Hemani et al. (A) In two-sample MR, the associations of the instrument(s) with the exposure and outcome are derived from two independent (i.e., non-overlapping) samples. In this example, there are three SNPs acting as genetic IVs for the hypothetical exposure (i.e., SNP1, SNP2 and SNP3). (B) The SNP-exposure estimates for each of the three SNPs are derived from a genome-wide association study (GWAS) of the exposure variable (results that are depicted in a Manhattan plot). (C) The estimates of association between these same three SNPs and the outcome variable are then obtained from the outcome GWAS (results that are also depicted in a Manhattan plot). (D) Effects are harmonized to ensure that the ‘effect’ estimates in both the exposure and outcome GWASs correspond to the same allele (i.e., one that consistently either increases or decreases the exposure variable) for each SNP. (E) Once effects are harmonized, MR analyses can be performed. Visually, a scatter plot can be generated to represent the results, whereby the slope of the line is equivalent to the causal estimate. For example, when using the inverse-variance weighted method, the intercept is held at zero.
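Steps (D) and (E) can be sketched as follows. This is a simplified illustration under stated assumptions: the record layout, function names and numbers are hypothetical, harmonization only handles swapped effect alleles (palindromic SNPs and strand issues are ignored), and the inverse-variance weighted (IVW) estimate is the fixed-effect form, i.e., a weighted regression through the origin with weights 1 / se_y².

```python
import math


def harmonise(exposure, outcome):
    """Align outcome effects to the exposure effect allele.

    Both inputs map SNP id -> (effect_allele, other_allele, beta, se).
    Returns a list of (snp, beta_x, beta_y, se_y) tuples.
    """
    pairs = []
    for snp, (ea_x, oa_x, beta_x, _se_x) in exposure.items():
        if snp not in outcome:
            continue
        ea_y, oa_y, beta_y, se_y = outcome[snp]
        if (ea_y, oa_y) == (ea_x, oa_x):
            pairs.append((snp, beta_x, beta_y, se_y))
        elif (ea_y, oa_y) == (oa_x, ea_x):
            # Outcome GWAS reported the other allele: flip the sign.
            pairs.append((snp, beta_x, -beta_y, se_y))
        # Any other allele combination is dropped in this sketch.
    return pairs


def ivw(pairs):
    """Fixed-effect IVW estimate: slope of a weighted line through the origin."""
    num = sum(bx * by / se ** 2 for _, bx, by, se in pairs)
    den = sum(bx ** 2 / se ** 2 for _, bx, by, se in pairs)
    return num / den, math.sqrt(1.0 / den)


# Invented summary statistics for three SNPs.
exposure = {
    "SNP1": ("A", "G", 0.10, 0.01),
    "SNP2": ("C", "T", 0.20, 0.02),
    "SNP3": ("G", "A", 0.15, 0.01),
}
outcome = {
    "SNP1": ("A", "G", 0.05, 0.02),
    "SNP2": ("T", "C", -0.10, 0.02),  # reported for the other allele
    "SNP3": ("G", "A", 0.075, 0.02),
}

pairs = harmonise(exposure, outcome)
estimate, se = ivw(pairs)
```

In this toy example SNP2 is reported against the opposite allele in the outcome data, so its sign is flipped before the slope is computed; skipping that step would pull the estimate toward the wrong value.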
Figure 2.7 - Two-step Mendelian randomization for exploring mediation. (A) In the first step of two-step MR, a genetic variant (Z1) is used as an instrument for the exposure of interest (X) to estimate the causal impact of the exposure on a hypothesized mediator (M) of the association between the exposure (X) and outcome (Y). (B) In the second step, a genetic variant (Z2) that is independent of Z1 is used as an instrument for the mediator (M) to establish the causal impact of the mediator (M) on the outcome (Y). If there is evidence for a causal effect of X on M and of M on Y (as well as of X on Y), the estimates from these two steps can be combined to provide evidence for or against the mediating role of a variable on the exposure-outcome effect using, e.g., multivariable MR.
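One way the two step-specific estimates are sometimes combined is the product-of-coefficients approach, in which the indirect (mediated) effect is the X-on-M estimate multiplied by the M-on-Y estimate. The caption above names multivariable MR rather than this approach, so the sketch below is a swapped-in illustration; the function name, the delta-method standard error and the numbers are assumptions.

```python
import math


def indirect_effect(beta_xm, se_xm, beta_my, se_my):
    """Product-of-coefficients estimate of the effect of X on Y via M.

    beta_xm, se_xm: step-one MR estimate of X on M and its standard error.
    beta_my, se_my: step-two MR estimate of M on Y and its standard error.
    The standard error is a first-order delta-method approximation that
    treats the two estimates as independent.
    """
    estimate = beta_xm * beta_my
    se = math.sqrt(beta_xm ** 2 * se_my ** 2 + beta_my ** 2 * se_xm ** 2)
    return estimate, se


# Invented step-one and step-two estimates.
mediated, mediated_se = indirect_effect(beta_xm=0.4, se_xm=0.05,
                                        beta_my=0.5, se_my=0.1)
```

Treating the two estimates as independent relies on Z1 and Z2 being independent instruments, as required in panel (B).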
Figure 4.1 - Collider bias. This figure illustrates collider bias in an MR study exploring the effect of maternal pregnancy body mass index (X: Mat. BMI) on offspring body mass index in later adult life (Y: Off. BMI). There is a clear violation of the assumption that the genetic instrumental variable (Z: Mat. BMI genetic IVs) is not related to the outcome (Y) other than via the exposure (X): maternal BMI genetic IVs will influence offspring BMI genes (Off. BMI genes), which in turn will influence offspring BMI. This path can be blocked by adjusting for offspring genetic variants (this is illustrated by the box around offspring BMI genes). However, both maternal and paternal BMI genes (Z: Pat. BMI genes) ‘collide’ on offspring BMI genes, and conditioning on offspring BMI genes generates a spurious association between maternal and paternal BMI genes. As only maternal and paternal BMI genes determine offspring BMI genes, once we condition maternal genes on offspring genes, we know to some extent what the paternal genotype is (e.g., if the mother is a homozygote for a BMI-increasing allele at a gene and the offspring is heterozygous at the same gene, then the father must be heterozygous or homozygous for the alternative (not BMI-increasing) allele). If it is not possible to condition on paternal genotype (which is often the case) and paternal BMI genes directly influence offspring BMI, this collider bias will bias the MR analysis of maternal pregnancy BMI on subsequent offspring BMI (26).
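The spurious maternal-paternal association described above can be made concrete by enumerating Mendelian transmission at a single biallelic locus. This is a toy sketch, assuming random mating, Hardy-Weinberg proportions and an allele frequency of 0.5; genotypes are coded as the number of BMI-increasing alleles (0, 1 or 2), and all function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def genotype_probs(p):
    """Hardy-Weinberg genotype probabilities for allele frequency p."""
    q = 1 - p
    return {0: q * q, 1: 2 * p * q, 2: p * p}


def joint_trio_distribution(p):
    """P(mother, father, offspring genotype) with independent parents."""
    gp = genotype_probs(p)
    table = {}
    for m, f in product(range(3), repeat=2):
        tm, tf = Fraction(m, 2), Fraction(f, 2)  # P(parent transmits the allele)
        offspring = {
            0: (1 - tm) * (1 - tf),
            1: tm * (1 - tf) + (1 - tm) * tf,
            2: tm * tf,
        }
        for o, prob in offspring.items():
            table[(m, f, o)] = gp[m] * gp[f] * prob
    return table


def parental_covariance(table, offspring=None):
    """Covariance of maternal and paternal genotype, optionally given offspring."""
    rows = {k: v for k, v in table.items()
            if offspring is None or k[2] == offspring}
    total = sum(rows.values())
    e_m = sum(m * v for (m, _f, _o), v in rows.items()) / total
    e_f = sum(f * v for (_m, f, _o), v in rows.items()) / total
    e_mf = sum(m * f * v for (m, f, _o), v in rows.items()) / total
    return e_mf - e_m * e_f


table = joint_trio_distribution(Fraction(1, 2))
unconditional = parental_covariance(table)           # parents are independent
given_het_child = parental_covariance(table, offspring=1)
```

Without conditioning, the parental genotypes are uncorrelated; among trios with a heterozygous offspring the covariance becomes negative, and the combination of a homozygous mother, a heterozygous offspring and a father homozygous for the same allele has probability zero, matching the example in the caption.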
Figure 4.2 - Vertical and Horizontal Pleiotropy. Adapted from Hemani et al. and Holmes et al. (A) Classic horizontal pleiotropy, whereby the instrument (Z) for the exposure of interest (X) is independently associated with the outcome (Y) either directly or indirectly through other trait(s) – denoted “?”. This would violate the third assumption of MR and would bias results from an MR study. (B) Indirect horizontal pleiotropy, whereby another SNP (Z2) in linkage disequilibrium (LD) with the instrument (Z1) for the exposure of interest (X) is associated with the outcome (Y) and, due to this correlation between SNPs, the instrument is therefore not independent of the outcome of interest. This is another reason to use independent genetic variants as instruments in an MR analysis and to have some biological knowledge about the mechanisms by which the SNPs are associated with the exposure. (C) A depiction of vertical pleiotropy, whereby the genetic instrument (Z) for the exposure (X) is associated with other trait(s) – denoted “?” – however, this reflects downstream effects of the exposure that are likely on the causal pathway linking the exposure to the outcome (Y). This is the very essence of MR and is not something that needs to be accounted for in analyses. Measured and unmeasured confounders in all diagrams are represented by “U”, “U1” and “U2”.
References