David A. Kenny
December 18, 2002

Instrumental Variable Estimation


        One way of identifying models that cannot be estimated by multiple regression is through the use of instrumental variables.  For path analysis, the disturbance must not be correlated with any causal variable.  There are three reasons why such a correlation might exist:

        Given the above, one or more causal variables are correlated with the disturbance of the endogenous variable.  Thus, multiple regression cannot be used to estimate the causal coefficients.  Denote Y as the endogenous variable, U as its disturbance, X as a causal variable that needs an instrument, I as an instrumental variable, and Z as the set of variables that cause Y but do not need an instrumental variable.  The defining feature of an instrumental variable is that I is assumed not to directly cause Y:  the path from I to Y is zero.  The zero path is given by theory, not by statistical analysis.  That is, one should not regress Y on X, I, and Z and then select I by seeing which variables have coefficients that are not significantly different from zero.

Conditions for instrumental variable estimation:
1)   The variable I must neither directly cause Y nor be correlated with U.
2)   For a given structural equation, there must be at least as many I variables as there are variables needing an instrument.
3)   The variable I must cause the variable that needs an instrument.
 (For the details, see the identification of models with instrumental variables.)
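The conditions above can be illustrated with a small simulation.  The following is a minimal sketch, not a definitive implementation:  the sample size, the path values, and the way the causal variable (called X here) is made to correlate with U are all assumptions chosen for illustration.  With a single instrument and no Z variables, the instrumental variable estimate reduces to the ratio cov(I, Y) / cov(I, X).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000  # assumed sample size

# Hypothetical data-generating process (all numeric values are assumptions).
I = rng.normal(size=n)                      # instrument
U = rng.normal(size=n)                      # disturbance of Y
X = 0.8 * I + 0.6 * U + rng.normal(size=n)  # I causes X; X is correlated with U
Y = 0.5 * X + U                             # true path X -> Y is 0.5; path I -> Y is zero

# Ordinary regression slope of Y on X: biased because X is correlated with U.
ols_slope = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

# Instrumental variable estimate with one instrument.
iv_slope = np.cov(I, Y)[0, 1] / np.cov(I, X)[0, 1]

print(f"regression slope: {ols_slope:.3f}")
print(f"IV estimate:      {iv_slope:.3f}")
```

Under these assumed values the regression slope overstates the path from X to Y, while the instrumental variable estimate should fall close to the true value of 0.5.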

Mechanics of Two-Stage Least Squares (2SLS)
Although this method is not currently used very often for the estimation of models with instrumental variables, it is instructive to understand how it works.  2SLS estimation is available as an option within SPSS.

In actuality, 2SLS computer programs carry out both stages in a single step.
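As a rough sketch of the two stages, the code below follows the same pattern as the example on this page:  stage 1 regresses each variable needing an instrument on the instruments and the other causes, and stage 2 regresses Y on the stage 1 predicted scores and the other causes.  The function name `two_stage_least_squares`, its arguments, and the simulated values are hypothetical and chosen only for illustration.  Variables are mean-centered so that no intercept is needed, which is an assumption of this sketch.

```python
import numpy as np

def two_stage_least_squares(y, needs_instrument, instruments, other_causes):
    """Hypothetical helper (not from any package): 2SLS path estimates.

    y is a 1-D array; the other arguments are 2-D arrays (cases x variables).
    Returns coefficients ordered as [needs_instrument..., other_causes...].
    """
    # Center every variable so that no intercept is needed (an assumption).
    center = lambda m: m - m.mean(axis=0)
    y, X, I, Z = map(center, (y, needs_instrument, instruments, other_causes))

    # Stage 1: regress each variable needing an instrument on I and Z.
    stage1_design = np.column_stack([I, Z])
    stage1_coef, *_ = np.linalg.lstsq(stage1_design, X, rcond=None)
    X_hat = stage1_design @ stage1_coef

    # Stage 2: regress Y on the stage 1 predicted scores and Z.
    stage2_design = np.column_stack([X_hat, Z])
    stage2_coef, *_ = np.linalg.lstsq(stage2_design, y, rcond=None)
    return stage2_coef

# Simulated illustration; every numeric value below is an assumption.
rng = np.random.default_rng(42)
n = 20000
I = rng.normal(size=(n, 1))
Z = rng.normal(size=(n, 1))
U = rng.normal(size=n)
X = 0.7 * I[:, 0] + 0.2 * Z[:, 0] + 0.5 * U + rng.normal(size=n)
Y = 0.5 * X + 0.3 * Z[:, 0] + U   # true paths: 0.5 for X, 0.3 for Z

estimates = two_stage_least_squares(Y, X.reshape(-1, 1), I, Z)
print(estimates)
```

Running the two regressions by hand in this way gives the 2SLS coefficients, but the standard errors reported by an ordinary second-stage regression are not correct, which is one reason to rely on a dedicated 2SLS routine in practice.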

2SLS Example
Structural Equations:

                       Z = aX + bY + U
                       Y = cQ + dZ + V

Note that the notation has changed from the section above.  For this example, variable Q serves as an instrumental variable for Y in the Z equation, and X serves as an instrumental variable for Z in the Y equation.

For the Z equation:
        Stage 1: Regress Y on X and Q.
        Stage 2: Regress Z on the stage 1 predicted score for Y and X.

For the Y equation:
        Stage 1: Regress Z on X and Q.
        Stage 2: Regress Y on the stage 1 predicted score for Z and Q.
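The four regressions listed above can be sketched on simulated data.  This is only an illustration under stated assumptions:  the values of a, b, c, and d, the sample size, and the independent normal distributions for X, Q, U, and V are all hypothetical.  Because Z and Y cause each other, the data are generated from the reduced form obtained by solving the two structural equations for Z.  All variables are generated with mean zero, so no intercept is fitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000                          # assumed sample size
a, b, c, d = 0.6, 0.3, 0.5, 0.4    # assumed structural coefficients

X = rng.normal(size=n)
Q = rng.normal(size=n)
U = rng.normal(size=n)
V = rng.normal(size=n)

# Reduced form: substitute Y = cQ + dZ + V into Z = aX + bY + U and solve for Z.
Z = (a * X + b * c * Q + U + b * V) / (1 - b * d)
Y = c * Q + d * Z + V

def ols(outcome, *predictors):
    """Least-squares coefficients and predicted scores (no intercept)."""
    design = np.column_stack(predictors)
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef, design @ coef

# Z equation: Q is the instrument for Y.
_, Y_hat = ols(Y, X, Q)                 # Stage 1: regress Y on X and Q
(b_hat, a_hat), _ = ols(Z, Y_hat, X)    # Stage 2: regress Z on predicted Y and X

# Y equation: X is the instrument for Z.
_, Z_hat = ols(Z, X, Q)                 # Stage 1: regress Z on X and Q
(d_hat, c_hat), _ = ols(Y, Z_hat, Q)    # Stage 2: regress Y on predicted Z and Q

print(f"a: {a_hat:.3f}  b: {b_hat:.3f}  c: {c_hat:.3f}  d: {d_hat:.3f}")
```

With a sample this large, the four estimates should fall close to the assumed values of a, b, c, and d.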

