I'm interested in machine learning, causal inference, algorithmic game theory, and related topics.

Instrumental Variable Estimation | Causal Inference: What if, Chapter 16

This is a personal study note. It may contain errors, so if you notice any, I'd be happy if you let me know.

The notation here follows Causal Inference: What if.

Here is a link to this book.

16.1 The three instrumental conditions

If Z meets the following three conditions, we say that Z is an instrumental variable (IV).

  1. Z is associated with A (relevance condition)
  2. Z does not affect Y except through its potential effect on A (exclusion restriction)
  3. Z and Y do not share causes

Condition (2) can be written as:

Y_i^{z, a} = Y_i^{z', a} = Y_i^{a} \quad \forall z, z'.

Precisely, condition (3) can be expressed as:

Y^{a, z}\perp  Z \quad \forall a, z.

Condition (1) can be verified empirically, while conditions (2) and (3) cannot.

The three conditions are not sufficient to identify the causal effect in the population. One more condition is required. It will be described in 16.3.

Examples of instrumental variables

  • A: smoking, Y: weight, Z: the price of cigarettes
  • A: alcohol intake, Y: the risk of coronary heart disease, Z: a genetic factor associated with alcohol metabolism
  • A: COX-2 selective versus non-selective nonsteroidal anti-inflammatory drugs, Y: gastrointestinal bleeding, U_z: physician's preference for treatments, Z: the last prescription issued by the physician before the current prescription
  • Z: access to the treatment, for example, physical distance or travel time to a facility

16.2 The usual IV estimand

When the three conditions above and the additional condition hold, the causal effect is identified as follows:

\mathrm{E}[Y^{a=1}] - \mathrm{E}[Y^{a=0}]= \frac{\mathrm{E}[Y|Z=1] - \mathrm{E}[Y|Z=0]}{\mathrm{E}[A|Z=1] - \mathrm{E}[A|Z=0]}

The right-hand side is called the usual IV estimand (written here for a dichotomous instrument Z).

How to compute the IV estimand?

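You can estimate each of the four conditional expectations above by the corresponding sample average. This is equivalent to fitting the two saturated linear models below and taking the ratio \hat{\beta}_1 / \hat{\alpha}_1 of the fitted slopes: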

\mathrm{E}[A|Z] = \alpha_0 + \alpha_1Z, \\ \mathrm{E}[Y|Z] = \beta_0 + \beta_1Z.
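As a minimal sketch of this computation, the snippet below estimates the four conditional means by sample averages and takes their ratio. The data-generating process (an unmeasured confounder `u`, a randomized dichotomous `z`, and a constant treatment effect of 2) and the function name `wald_iv_estimate` are assumptions made for illustration only; they are not taken from the book.

```python
import numpy as np


def wald_iv_estimate(z, a, y):
    """Usual IV estimate for a dichotomous instrument z coded as 0/1."""
    z = np.asarray(z)
    a = np.asarray(a, dtype=float)
    y = np.asarray(y, dtype=float)
    # Numerator: E[Y|Z=1] - E[Y|Z=0]; denominator: E[A|Z=1] - E[A|Z=0].
    numerator = y[z == 1].mean() - y[z == 0].mean()
    denominator = a[z == 1].mean() - a[z == 0].mean()
    return numerator / denominator


# Hypothetical data-generating process (an assumption of this sketch).
rng = np.random.default_rng(0)
n = 100_000
u = rng.normal(size=n)                  # unmeasured confounder of A and Y
z = rng.integers(0, 2, size=n)          # randomized dichotomous instrument
a = (0.8 * z + 0.5 * u + rng.normal(size=n) > 0.5).astype(int)
y = 2.0 * a + u + rng.normal(size=n)    # constant treatment effect of 2

iv_estimate = wald_iv_estimate(z, a, y)
naive_estimate = y[a == 1].mean() - y[a == 0].mean()  # confounded by u

print(f"IV estimate:    {iv_estimate:.3f}")
print(f"naive contrast: {naive_estimate:.3f}")
```

In this simulated setting the IV estimate should land near the true effect of 2, while the naive treated-versus-untreated contrast is pushed upward by the confounder.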

Another way to estimate the usual IV estimand is the two-stage-least-squares estimator.

two-stage-least-squares estimator

  • Fit the first-stage treatment model

\mathrm{E}[A|Z] = \alpha_0 + \alpha_1Z.
  • Fit the second-stage outcome model

\mathrm{E}[Y|Z] = \beta_0 + \beta_1\widehat{\mathrm{E}}[A|Z].

The estimate of β_1 is always numerically equal to the usual IV estimate.
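The numerical equivalence can be checked with a short sketch. Both stages are fitted by ordinary least squares with `numpy.linalg.lstsq`; the helper names `ols_fit` and `two_stage_least_squares` and the simulated data are hypothetical and only serve as an illustration.

```python
import numpy as np


def ols_fit(x, target):
    """Return (intercept, slope) of the least-squares line target ~ x."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0], coef[1]


def two_stage_least_squares(z, a, y):
    # First stage: E[A|Z] = alpha_0 + alpha_1 Z.
    alpha0, alpha1 = ols_fit(z, a)
    a_hat = alpha0 + alpha1 * z
    # Second stage: E[Y|Z] = beta_0 + beta_1 * predicted A.
    _, beta1 = ols_fit(a_hat, y)
    return beta1


# Hypothetical simulated data (an assumption of this sketch).
rng = np.random.default_rng(0)
n = 10_000
u = rng.normal(size=n)
z = rng.integers(0, 2, size=n).astype(float)
a = (0.8 * z + 0.5 * u + rng.normal(size=n) > 0.5).astype(float)
y = 2.0 * a + u + rng.normal(size=n)

beta1 = two_stage_least_squares(z, a, y)
wald = (y[z == 1].mean() - y[z == 0].mean()) / (a[z == 1].mean() - a[z == 0].mean())

print(f"2SLS beta_1: {beta1:.6f}")
print(f"ratio of differences in means: {wald:.6f}")
```

With a dichotomous Z the predicted treatment takes only two values, so the second-stage slope reduces to the ratio of the two differences in means. Note that the standard errors reported by a naive second-stage regression are not valid for the IV estimate; this sketch only illustrates the point estimate.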

The confidence interval around the IV estimate is usually very wide when A and Z are only weakly associated. A commonly used rule of thumb is to declare an instrument weak if the F-statistic from the first-stage model is less than 10.
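A rough sketch of how the first-stage F-statistic could be computed by hand is shown below. It uses the usual regression F-statistic for a single regressor, F = (TSS - RSS) / (RSS / (n - 2)). The function name `first_stage_f_statistic` and both simulated instruments are assumptions for illustration.

```python
import numpy as np


def first_stage_f_statistic(z, a):
    """F-statistic for the slope in the first-stage model A ~ 1 + Z."""
    z = np.asarray(z, dtype=float)
    a = np.asarray(a, dtype=float)
    n = len(z)
    design = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    residuals = a - design @ coef
    rss = residuals @ residuals              # residual sum of squares
    tss = ((a - a.mean()) ** 2).sum()        # total sum of squares
    # One regressor besides the intercept, so n - 2 residual degrees of freedom.
    return (tss - rss) / (rss / (n - 2))


# Hypothetical instruments of different strength (assumptions of this sketch).
rng = np.random.default_rng(0)
n = 2_000
z = rng.integers(0, 2, size=n)
a_strong = rng.binomial(1, 0.3 + 0.30 * z)   # Z shifts P(A=1) by 0.30
a_weak = rng.binomial(1, 0.3 + 0.01 * z)     # Z shifts P(A=1) by 0.01

f_strong = first_stage_f_statistic(z, a_strong)
f_weak = first_stage_f_statistic(z, a_weak)

print(f"F (strong instrument): {f_strong:.1f}")
print(f"F (weak instrument):   {f_weak:.1f}")
```

Under these assumed settings the strong instrument should give an F-statistic far above 10, whereas the weak one will typically fall below the threshold.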

structural mean model

The two-stage-least-squares estimator requires investigators to make strong parametric assumptions. Some of these assumptions can be avoided by using a structural mean model, whose parameters can be estimated via g-estimation.

16.3 A fourth identifying condition: homogeneity

There are four homogeneity conditions.

  • constant effect of treatment A on outcome Y across individuals (A is dichotomous)

Y_i^{a=1} - Y_i^{a=0} = const \quad \forall i.
  • equality of the average causal effect of A on Y across levels of Z both in the treated and in the untreated. (Z, A are dichotomous)

\mathrm{E}[Y^{a=1} - Y^{a=0}|Z=1, A=a] = \mathrm{E}[Y^{a=1} - Y^{a=0}|Z=0, A=a]\quad \forall a =0,1
  • U is not an additive effect modifier (A is dichotomous)

\mathrm{E}[Y^{a=1}|U] - \mathrm{E}[Y^{a=0}|U] = \mathrm{E}[Y^{a=1}] - \mathrm{E}[Y^{a=0}].
  • Z-A association on the additive scale is constant across levels of the confounders U (no restriction on A)

\mathrm{E}[A | Z = 1, U] - \mathrm{E}[A | Z=0, U] = \mathrm{E}[A|Z=1] - \mathrm{E}[A|Z=0].

The fourth condition (constant Z-A association across levels of U) has some testable implications. For a dichotomous A, if some of the confounders are measured, the Z-A difference on the additive scale must be the same across levels of the measured confounders.

For a continuous A, if we are willing to make additional assumptions about linearity, the variance of the treatment A must be constant across levels of the instrument Z.

Homogeneity seems implausible to some people.
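To see why homogeneity is enough for identification, here is a sketch of the argument under the first (and strongest) homogeneity condition, writing c for the constant individual effect Y_i^{a=1} - Y_i^{a=0}. The symbol c is introduced here only for this derivation; A and Z are dichotomous.

```latex
\begin{align*}
Y &= Y^{a=0} + A\,(Y^{a=1} - Y^{a=0}) = Y^{a=0} + cA
  && \text{(consistency, constant effect } c\text{)} \\
\mathrm{E}[Y|Z=z] &= \mathrm{E}[Y^{a=0}|Z=z] + c\,\mathrm{E}[A|Z=z]
  && \text{(take expectations given } Z=z\text{)} \\
\mathrm{E}[Y|Z=1] - \mathrm{E}[Y|Z=0]
  &= c\,\bigl(\mathrm{E}[A|Z=1] - \mathrm{E}[A|Z=0]\bigr)
  && \text{(conditions (2) and (3): } Y^{a=0} \perp Z\text{)} \\
c &= \frac{\mathrm{E}[Y|Z=1] - \mathrm{E}[Y|Z=0]}{\mathrm{E}[A|Z=1] - \mathrm{E}[A|Z=0]}
  && \text{(condition (1): denominator } \neq 0\text{)}
\end{align*}
```

Since c equals \mathrm{E}[Y^{a=1}] - \mathrm{E}[Y^{a=0}] when the effect is constant, the last line is the usual IV estimand.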

16.4 An alternative fourth condition: monotonicity

We investigate an alternative assumption to homogeneity. We define some counterfactual variables to describe a new condition named monotonicity.

A^{z=1}: \text{indicator of the treatment if assigned to treatment} \\A^{z=0}: \text{indicator of the treatment if assigned to no treatment}

Monotonicity is described as follows:

A^{z=1} \geq A^{z=0}.

Assuming this monotonicity property, the usual IV estimand equals

\mathrm{E}[Y^{a=1} - Y^{a=0}|A^{z=1} =1,  A^{z=0} = 0].

This is the average causal effect in the compliers, i.e. the individuals with A^{z=1} = 1 and A^{z=0} = 0, who would take whatever treatment they are assigned (A^{z} = z).
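The following simulation is a minimal sketch of this result. The population is split into always-takers, never-takers, and compliers (no defiers, so monotonicity holds), each with a different treatment effect. The stratum proportions, effect sizes, and variable names are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

# Hypothetical strata: 0 = always-taker, 1 = never-taker, 2 = complier.
stratum = rng.choice([0, 1, 2], size=n, p=[0.2, 0.3, 0.5])
z = rng.integers(0, 2, size=n)           # randomized instrument

# Counterfactual treatments; A^{z=1} >= A^{z=0} for everyone (no defiers).
a_z1 = np.where(stratum == 1, 0, 1)      # only never-takers refuse under z=1
a_z0 = np.where(stratum == 0, 1, 0)      # only always-takers take under z=0
a = np.where(z == 1, a_z1, a_z0)         # observed treatment

# Heterogeneous individual effects (assumed): always 4.0, never 0.0, complier 1.0.
effect = np.select([stratum == 0, stratum == 2], [4.0, 1.0], default=0.0)
y0 = rng.normal(size=n) + 1.0 * (stratum == 0)   # baseline differs by stratum
y = y0 + effect * a                      # Z affects Y only through A

wald = (y[z == 1].mean() - y[z == 0].mean()) / (a[z == 1].mean() - a[z == 0].mean())
complier_effect = effect[stratum == 2].mean()
average_effect = effect.mean()

print(f"usual IV estimate:         {wald:.3f}")
print(f"effect in the compliers:   {complier_effect:.3f}")
print(f"effect in the population:  {average_effect:.3f}")
```

In this setup the usual IV estimate should track the effect in the compliers rather than the average effect in the whole population, because homogeneity does not hold.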

Monotonicity seems plausible, but there are some issues discussed below.

Relevance issue

The proportion of compliers can be small, so even if you know the causal effect in the compliers, it may be hard to decide whether to assign the treatment to the overall population.

Observational data

In observational studies, monotonicity is less plausible because there are likely to be defiers, i.e. individuals with A^{z=1} < A^{z=0}.

Ill-defined partitioning

See Causal Inference: What if.

16.5 The three instrumental conditions revisited

What if the conditions fail to hold?

Condition (1): What if a Z-A association is weak?

  • The 95% confidence interval becomes wide.
  • Bias caused by violations of conditions (2) and (3) is amplified.
  • A weak instrument itself introduces bias in the standard IV estimator.

Condition (2): What if the instrument has a direct effect on the outcome?

\mathrm{E}[Y|Z=1] - \mathrm{E}[Y|Z=0]

If the instrument has a direct effect on the outcome, that direct effect contributes to the numerator of the usual IV estimator shown just above. Dividing by the denominator E[A|Z=1] - E[A|Z=0] then inflates this contribution as if it were part of the effect of treatment A.
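A small simulation can illustrate the inflation. In the sketch below, the outcome follows an assumed linear model with a true treatment effect of 1 and a direct effect of Z of 0.1, so the bias of the usual IV estimate is roughly the direct effect divided by the Z-A difference. All names and numbers are hypothetical.

```python
import numpy as np


def wald_iv_estimate(z, a, y):
    """Usual IV estimate for a dichotomous instrument z coded as 0/1."""
    numerator = y[z == 1].mean() - y[z == 0].mean()
    denominator = a[z == 1].mean() - a[z == 0].mean()
    return numerator / denominator


def simulate_bias(strength, direct_effect, rng, n=200_000, true_effect=1.0):
    """Bias of the usual IV estimate when Z also affects Y directly."""
    u = rng.normal(size=n)                        # unmeasured confounder
    z = rng.integers(0, 2, size=n)                # randomized instrument
    p_treat = 0.2 + strength * z + 0.2 * (u > 0)  # E[A|Z=1] - E[A|Z=0] = strength
    a = rng.binomial(1, p_treat)
    # Violation of condition (2): the term direct_effect * z bypasses A.
    y = true_effect * a + direct_effect * z + u + rng.normal(size=n)
    return wald_iv_estimate(z, a, y) - true_effect


rng = np.random.default_rng(0)
bias_strong = simulate_bias(strength=0.5, direct_effect=0.1, rng=rng)
bias_weak = simulate_bias(strength=0.1, direct_effect=0.1, rng=rng)

print(f"bias with strong instrument (expected about 0.1 / 0.5): {bias_strong:.2f}")
print(f"bias with weak instrument   (expected about 0.1 / 0.1): {bias_weak:.2f}")
```

The same direct effect of 0.1 produces a much larger bias with the weaker instrument, which is the amplification listed under condition (1).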

Condition (2) can also fail in a less obvious way. For example, suppose a continuous treatment A is replaced in the analysis by a coarser version A*. Z can still affect Y through parts of A that A* does not capture, so relative to A* there is a direct effect Z → Y.

Condition (3): What if there is confounding for the effect of the instrument on the outcome?

As with a violation of condition (2), the confounding contributes to the numerator and is inflated by the denominator.

16.6 Instrumental variable estimation versus other methods

  • Unlike other methods, IV estimation requires modeling assumptions even if infinite data were available.
  • Relatively minor violations of the conditions for IV estimation might result in large biases.
  • The ideal setting for standard IV estimation is more restrictive than for other methods, e.g. a truly dichotomous and time-fixed treatment A, a strong and causal proposed instrument Z, and either homogeneity or monotonicity.