David A. Kenny
December 21, 2011
Recently updated. Please let me
know if you find any errors or have any suggestions.
Thanks to those of you who have already sent corrections.
Learn how
you can do a mediation analysis and output a text description of your results: Go to
mediational analysis using DataToText.
MEDIATION
Introduction
Baron & Kenny Steps
Indirect Effect
Power
Specification Error
Extensions
Un-discussed Topics
Links to Other Sites
References
Consider a variable X that is assumed to affect another variable Y. The variable X is called the initial variable and the variable that it causes or Y is called the outcome. In diagrammatic form, the unmediated model is
Path c in the above model is called the total effect. The effect of X on Y may be mediated by a process or mediating variable M, and the variable X may still affect Y. The mediated model is
(These two diagrams are essential to the understanding of this page. Please study them carefully!) Path c' is called the direct effect. The mediator has been called an intervening or process variable. Complete mediation is the case in which variable X no longer affects Y after M has been controlled and so path c' is zero. Partial mediation is the case in which the path from X to Y is reduced in absolute size but is still different from zero when the mediator is introduced.
Note that a mediational model is a
causal model. For example, the mediator is presumed to cause the outcome
and not vice versa. If the presumed model is not correct, the results
from the mediational analysis are of little value. Mediation is not
defined statistically; rather statistics can be used to evaluate a presumed
mediational model. The specific causal assumptions are detailed below in
the section on Specification Error.
There is a long history in the study of
mediation (Hyman, 1955; MacCorquodale & Meehl, 1948). Currently
mediation is a very popular topic. (This page averages over 250 visitors
a day and Baron and Kenny (1986) has over 15000 citations.) There are
several reasons for the intense interest in this topic. One reason for
testing mediation is trying to understand the mechanism through which the
initial variable affects the outcome. Mediation (and moderation) analysis
are a key part of what has been called process analysis. Moreover
when most causal or structural models are examined, the mediational part of the
model is the most interesting part of that model.
If the mediational model (see
above) is correctly specified, the paths (c, a, b,
and c') can be estimated by multiple regression, sometimes
called ordinary least squares or OLS. As discussed later, other methods of
estimation (e.g., logistic regression, multilevel modeling, and structural
equation modeling) can be used. Regardless of which data analytic method is
used, the steps necessary for testing mediation are the same. This
section describes the analyses required for testing mediational hypotheses
[previously presented by Baron and Kenny (1986) and Judd and Kenny (1981)]. See
also Frazier, Tix, and Barron (2004) for a more contemporary introduction. We note that the Baron and Kenny (1986) steps
are at best a starting point in a mediational analysis. More contemporary analyses focus on the indirect effect.
The Steps
Baron and Kenny (1986) and Judd and
Kenny (1981) have discussed four steps in establishing mediation:
Step 1: Show
that the initial variable is correlated with the outcome. Use Y as the criterion
variable in a regression equation and X as a predictor (estimate and test path c in the above figure). This step
establishes that there is an effect that may be mediated.
Step 2: Show that
the initial variable is correlated with the mediator. Use M as the
criterion variable in the regression equation and X as a predictor (estimate
and test path a). This step
essentially involves treating the mediator as if it were an outcome variable.
Step 3: Show
that the mediator affects the outcome variable. Use Y as the criterion
variable in a regression equation and X and M as predictors (estimate and test
path b). It is not sufficient
just to correlate the mediator with the outcome; the mediator and the outcome
may be correlated because they are both caused by the initial variable X.
Thus, the initial variable must be controlled in establishing the effect of the
mediator on the outcome.
Step 4: To
establish that M completely mediates the X-Y relationship, the effect of X on Y
controlling for M (path c') should be
zero (see discussion below on significance testing). The effects in both
Steps 3 and 4 are estimated in the same equation.
If all four of these
steps are met, then the data are consistent with the hypothesis that variable M
completely mediates the X-Y relationship, and if the first three steps
are met but Step 4 is not, then partial mediation is
indicated. Meeting these steps does not, however, conclusively establish
that mediation has occurred because there are other (perhaps less plausible)
models that are consistent with the data. Some of these models are
considered later in the Specification Error section.
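The regressions in the steps above can be sketched in a few lines of code. This is a minimal illustration, not part of the original presentation: it assumes Python with NumPy, simulated data, and hypothetical population values (a = 0.5, b = 0.4, c' = 0.2); the helper name ols is made up for this example.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slopes...] of y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data; the population paths below are hypothetical.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)             # path a = 0.5
y = 0.2 * x + 0.4 * m + rng.normal(size=n)   # path c' = 0.2, path b = 0.4

c = ols(y, x)[1]               # Step 1: Y on X gives the total effect c
a = ols(m, x)[1]               # Step 2: M on X gives path a
_, c_prime, b = ols(y, x, m)   # Steps 3 and 4: Y on X and M gives c' and b

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
```

Only point estimates are shown here; a full analysis would also report standard errors and tests for each path.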
James and Brett (1984) have argued that
Step 3 should be modified by not controlling for the initial variable.
Their rationale is that if there were complete mediation, there would be no
need to control for the initial variable. However, because complete
mediation does not always occur, it would seem sensible to control for X in Step
3.
Note that the steps are stated in terms
of zero and nonzero coefficients, not in terms of statistical significance, as
they were in Baron and Kenny (1986). Because trivially small coefficients
can be statistically significant with large sample sizes and very large
coefficients can be nonsignificant with small sample sizes, the steps should
not be defined in terms of statistical significance. Statistical
significance is informative, but other information should be part of
statistical decision making. For instance, consider the case in which
path a is large and b is zero. In this case, c = c'.
It is very possible that the statistical test of c' is not significant (due to the collinearity between X and M),
whereas c is statistically significant.
It would then appear that there is complete mediation when in fact there is no
mediation at all.
Following Kenny, Kashy, and Bolger
(1998), one might ask whether all of the steps have to be met for there to be
mediation. Most contemporary analysts believe that the essential steps in
establishing mediation are Steps 2 and 3. Certainly, Step 4 does not have to be
met unless the expectation is for complete mediation. In the opinion of
most though not all analysts, Step 1 is not required. (See the Power section below for why the test of c can have low power, even if paths a and b are non-trivial.)
However, note that a path from the initial variable to the outcome is implied
if Steps 2 and 3 are met. If c' were
opposite in sign to ab, something that MacKinnon, Fairchild, and Fritz (2007)
refer to as inconsistent mediation,
then it could be the case that Step 1 would not be met, but there is still
mediation. In this case the mediator acts like a suppressor
variable. One example of inconsistent
mediation is the relationship between stress and mood as mediated by
coping. Presumably, the direct effect is
negative: more stress, the worse the mood.
However, likely the effect of stress on coping is positive (more stress,
more coping) and the effect of coping on mood is positive (more coping, better
mood), making the indirect effect positive.
The total effect of stress on mood then is likely to be very small
because the direct and indirect effects will tend to cancel each other out. Note too that with inconsistent mediation,
the direct effect is sometimes even larger than the total effect.
The amount of mediation is called
the indirect effect. Note that
the
total effect = direct effect + indirect effect
or using
symbols
c = c' + ab
Note also
that the indirect effect equals the reduction of the effect of the initial
variable on the outcome or ab = c - c'. In contemporary mediational analyses, the
indirect effect or ab is the measure
of the amount of mediation.
The equation of c = c' + ab exactly holds
when a) multiple regression (or structural equation modeling without latent
variables) is used, b) the same cases are used in all the analyses, and c) the
same covariates are in all the equations. However, the two sides are only
approximately equal for multilevel models, logistic analysis and structural
equation modeling with latent variables. For such models, it is probably
inadvisable to compute c from Step 1,
but rather c should be inferred to be c' +
ab and not directly computed.
Note also that the amount of reduction in the effect
of X on Y due to M is not equivalent to either the change in variance explained
or the change in an inferential statistic such as F or a p
value. It is possible for the F from the initial variable to the
outcome to decrease dramatically even when the mediator has no effect on the
outcome! It is also not equivalent to a change in partial
correlations. The way to measure mediation
is the indirect effect.
A related measure of mediation is
the proportion of the effect that is mediated, or the indirect effect divided
by the total effect or ab/c or equivalently 1 - c'/c.
Such a measure while theoretically informative is very unstable and should not
be computed if c is small. Note
that this measure can be greater than one or even negative when there is inconsistent
mediation. The measure
should only be computed if standardized c
is at least .2 in absolute value. The measure can be informative, especially when c' is not statistically
significant. See the example in Kenny et al. (1998) where c' is not statistically significant but
only 56% of c is explained. One rule of thumb is that if one wants to
claim complete mediation ab/c should be at least .80.
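As a small arithmetic sketch of the proportion mediated, the function below computes ab/c with c taken as c' + ab. The function name and the numeric values are hypothetical and chosen only to show how the ratio behaves under consistent and inconsistent mediation.

```python
def proportion_mediated(a, b, c_prime):
    """Return ab / c, where the total effect c is taken to be c' + ab."""
    indirect = a * b
    total = c_prime + indirect
    if total == 0:
        raise ValueError("total effect c is zero; the proportion is undefined")
    return indirect / total

# Consistent mediation: direct and indirect effects have the same sign,
# so the ratio falls between 0 and 1.
consistent = proportion_mediated(a=0.5, b=0.4, c_prime=0.2)

# Inconsistent mediation: a small negative direct effect makes the total
# effect smaller than the indirect effect, so the ratio exceeds 1.
above_one = proportion_mediated(a=0.5, b=0.4, c_prime=-0.1)

# Inconsistent mediation with a larger negative direct effect: the total
# effect changes sign and the ratio is negative.
negative = proportion_mediated(a=0.5, b=0.4, c_prime=-0.5)

print(consistent, above_one, negative)
```

The ratio becomes erratic as the total effect approaches zero, which is why it should not be computed when c is small.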
Tests of Steps 2 and 3
If Step 2 (the test of a) and Step 3 (the test of b) are met, it follows that there
necessarily is a reduction in the effect of X on Y. One way to test the
null hypothesis that ab = 0 is to test
that both paths a and b are zero (Steps 2 and 3). This
simple approach appears to work rather well (Fritz & MacKinnon, 2007), but
is rarely used. However, Fritz, Taylor, and MacKinnon (2012) have strongly urged that researchers
use this test in conjunction with other tests.
Sobel Test
It is much more common and more
highly recommended (MacKinnon, Lockwood, Hoffman, West, & Sheets, 2002) to
perform a single test of ab.
The test was first proposed by Sobel (1982). It requires the
standard error of a, or s_a (which equals a/t_a,
where t_a is the t test of coefficient a), and the
standard error of b, or s_b. The Sobel test provides
an approximate standard error of ab, which equals the square root of
b^2 s_a^2 + a^2 s_b^2
Other
standard errors have been proposed, but the Sobel test has been by far the most
commonly reported. (As discussed below, bootstrapping is replacing the more
conservative Sobel test.) The test of the indirect effect is given by
dividing ab by the square root of the
above variance and treating the ratio as a Z
test (i.e., larger than 1.96 in absolute value is significant at the .05
level). Kristopher J. Preacher and Geoffrey J. Leonardelli have an
excellent webpage that can help you calculate these tests.
Measures and tests of indirect effects are also available within many
structural equation modeling programs. These programs appear to use the
Sobel formula.
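The Sobel calculation described above is short enough to write out directly. The sketch below assumes Python's standard library only; the function name and the input estimates are hypothetical and for illustration, not results from any study.

```python
import math
from statistics import NormalDist

def sobel_test(a, s_a, b, s_b):
    """Return (z, two-sided p) for the indirect effect ab.

    Uses the Sobel standard error: sqrt(b^2 * s_a^2 + a^2 * s_b^2).
    """
    se_ab = math.sqrt(b**2 * s_a**2 + a**2 * s_b**2)
    z = (a * b) / se_ab
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # ratio treated as a Z test
    return z, p

# Hypothetical path estimates and standard errors
z, p = sobel_test(a=0.50, s_a=0.10, b=0.40, s_b=0.10)
print(f"Z = {z:.3f}, p = {p:.4f}")
```

The inputs can be unstandardized or standardized coefficients, provided the standard errors match the coefficients used.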
The derivation of the Sobel standard
error presumes that the estimates of paths
a and b are independent,
something that is true when the tests are from multiple regression but not true
when other tests are used (e.g., logistic regression, structural equation
modeling, and multilevel modeling). In such cases, the researcher ideally
provides evidence for approximate independence. Additionally, the Sobel
test can be conducted using the standardized or unstandardized
coefficients. Care must be taken to use the appropriate standard errors
if standardized coefficients are used.
The Sobel test
is very conservative (MacKinnon, Warsi, & Dwyer, 1995) and so it has very
low power. The main reason for the test
being conservative is that the sampling distribution of ab is highly skewed. If ab is positive, there is positive skew
with many small estimates of ab and
few very large ones. Because the Sobel
test uses a normal approximation, it
falsely presumes a symmetric distribution, which leads to a conservative test.
Bootstrapping
An increasingly popular method of
testing the indirect effect is bootstrapping (Bollen & Stine, 1990; Shrout &
Bolger, 2002). Bootstrapping is a
non-parametric method based on resampling with replacement which is done many
times, e.g., 2000 times. From each of
these samples the indirect effect is computed and a sampling distribution can
be empirically generated. Because the
mean of the bootstrapped distribution will not exactly equal the indirect
effect a correction for bias is usually made.
With the distribution, a confidence interval, a p value, or a standard error can be determined. Very typically a confidence interval is
computed and it is checked to determine if zero is in the interval. If zero is not in the interval, then the
researcher can be confident that the indirect effect is different from
zero. Also a p value can be determined, but standard errors suffer the same problem
as the Sobel standard errors and are not usually used.
Recently, Fritz, Taylor, and MacKinnon (2012)
have raised concerns that the bias-corrected bootstrapping test is too liberal, with alpha being around .07.
Not doing the bias correction actually seems to improve the Type I error rate.
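To make the resampling procedure concrete, here is a minimal sketch of a percentile bootstrap (that is, without the bias correction) for the indirect effect. It assumes Python with NumPy and simulated data with hypothetical population paths; the function names are made up for this example, and it is not the Hayes and Preacher macro.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Estimate ab from two OLS regressions: M on X, and Y on X and M."""
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile (not bias-corrected) bootstrap confidence interval for ab."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    tail = (1 - level) / 2
    return np.quantile(estimates, [tail, 1 - tail])

# Simulated data; population paths a = 0.5, b = 0.4, c' = 0.2 are hypothetical.
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.2 * x + 0.4 * m + rng.normal(size=n)

low, high = percentile_bootstrap_ci(x, m, y)
print(f"ab = {indirect_effect(x, m, y):.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

If zero lies outside the printed interval, the indirect effect would be judged different from zero at the chosen level.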
Hayes and
Preacher have written SPSS and SAS macros that can be downloaded for tests of
indirect effects. Also Mplus and Amos can be used to
bootstrap. If one has more
than one mediator and is using Amos, one should consult Macho and
Ledermann (2011) for details on how to compute separate confidence intervals for each
indirect effect.
Effect Size of the Indirect Effect
The indirect effect is the product of two effects.
One simple way, but not the only way, to determine the effect size is to measure the
product of the two effects, each turned into an effect size.
The effect size for paths a and b is a partial correlation; that is, for path a,
it is the correlation between X and M, controlling for the covariates and any
other Xs and for path b, it is the correlation between M and Y, controlling for
covariates and other Ms and Xs.
The effect size for the indirect effect would be the product of the two partial correlations.
(Preacher and Kelley (2011) discuss a similar measure of effect size which they refer to
as the completely standardized indirect effect.
However, they use betas, not partial correlations.) There are two different strategies for determining small, medium, and large
effect sizes.
First, following Shrout and Bolger (2002), the usual Cohen (1988) standards of .1
for small, .3 for medium, and .5 for large could be used.
Alternatively, and I think more appropriately because an indirect effect is a product of two effects, these values should be squared (i.e., r times r). Thus, a small effect size would be .01, medium would be .09, and large would be .25. Note that if X is a dichotomy, it makes sense to replace the correlation for path a with Cohen’s d.
In this case the effect size would be d times r and a small effect size would be .02,
medium would be .15, and large would be .40.
So far as I know, there does not
currently exist a computer program that can be used to compute the power of the
test of the indirect effect. Perhaps the
best way currently to conduct a power analysis for the indirect effect is to
use Mplus or some other program and run a simulation. Alternatively,
one can determine the power of the test of paths a and b and determine if each has sufficient power.
In determining the power of path b,
make sure to include collinearity due to path a in the calculation (see next section).
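A simulation of the kind suggested above can be sketched as follows. This is an illustrative Monte Carlo study, not a validated power program: it assumes Python with NumPy, standardized variables, normally distributed errors, and a normal approximation to the t critical value. The function names and the chosen values (a = b = .4, c' = 0, N = 100) are assumptions for the example. Because M is generated from X, the collinearity due to path a is built into the test of path b.

```python
import numpy as np
from statistics import NormalDist

def t_ratios(y, *predictors):
    """Return OLS t ratios (coefficient / standard error), intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return coef / np.sqrt(sigma2 * np.diag(xtx_inv))

def simulate_power(a, b, c_prime, n, n_sims=2000, alpha=0.05, seed=0):
    """Monte Carlo power for the tests of a, b, both a and b, and c."""
    rng = np.random.default_rng(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # normal approximation
    # Residual variance of Y chosen so that Y has unit variance.
    var_e = 1 - (c_prime**2 + b**2 + 2 * a * b * c_prime)
    hits = np.zeros(4)
    for _ in range(n_sims):
        x = rng.normal(size=n)
        m = a * x + np.sqrt(1 - a**2) * rng.normal(size=n)
        y = c_prime * x + b * m + np.sqrt(var_e) * rng.normal(size=n)
        sig_a = abs(t_ratios(m, x)[1]) > crit      # test of path a
        sig_b = abs(t_ratios(y, x, m)[2]) > crit   # test of path b, X controlled
        sig_c = abs(t_ratios(y, x)[1]) > crit      # test of the total effect c
        hits += [sig_a, sig_b, sig_a and sig_b, sig_c]
    return dict(zip(["a", "b", "a_and_b", "c"], hits / n_sims))

power = simulate_power(a=0.4, b=0.4, c_prime=0.0, n=100)
print(power)
```

The entry a_and_b gives the estimated power of the joint test that both paths are nonzero; other tests of the indirect effect, such as the bootstrap, would need to be added inside the loop.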
Distal and Proximal Mediation
To demonstrate mediation both
paths a and b need to be relatively large. Generally, the maximum size
of the product ab is c, and so as
path a increases, path b must decrease and vice versa. A mediator can be too close in time or in the
process to the initial variable and so path
a would be relatively large and path
b relatively small. An example of a proximal mediator (Hoyle & Kenny,
1999) is a manipulation check. The use of a very proximal mediator creates multicollinearity, which is
discussed in the next section. Alternatively, the mediator can be
chosen too close to the outcome; with such a distal mediator, path b is large and path a is small. Ideally in terms of power, standardized a and b should be comparable in size. However, work by Hoyle and
Kenny (1999) shows that the power of the test of ab is maximal when b is
somewhat larger than a in absolute
value. So slightly distal mediators result in somewhat greater power than
proximal mediators.
Multicollinearity
If M is a successful mediator, it is
necessarily correlated with X due to path a. This correlation, called
collinearity, affects the precision of the estimates of the last set of
regression equations. If X were to explain all of the variance in M, then
there would be no unique variance in M to explain Y. Given that path a is nonzero, the power of the tests of
the coefficients b and c’ is lowered. The effective
sample size for the tests of coefficients
b and c’ is approximately N(1 - r^2) where N is
the total sample size and r is the
correlation between the initial variable and the mediator, which is equal to
standardized a. So if M is a strong mediator (path a is large), to achieve equivalent
power, the sample size to test coefficients
b and c' would have to be larger
than what it would be if M were a weak
mediator. Thus, multicollinearity is to be expected in a mediational analysis and it cannot be
avoided.
Low Power for Step 1
If M completely mediates the X to Y
relationship, then c equals ab.
It can easily happen that a
and b are statistically
significant but c is not. For instance, if a = b = .4, making c = .16, and N = 100, the power of the test of path a is .99, the power of the test of path b is .97, but the power of the test of c is only .36. Ironically,
it is very easy to have complete mediation, a statistically significant
indirect effect, but no statistical evidence that X causes Y.
Mediation is a hypothesis about a causal
network. (See Kraemer, Wilson, Fairburn, and Agras (2002) who attempt to
define mediation without making causal assumptions.) The conclusions from
a mediation analysis are valid only if the causal assumptions are valid (Judd
& Kenny, 2010). In this section, the three major assumptions of
mediation are discussed. Mediation analysis also makes all of the
standard assumptions of the general linear model (i.e., linearity, normality,
homogeneity of error variance, and independence of errors). It is strongly advised to check these assumptions
before conducting a mediational analysis.
Clustering effects are discussed in the Extensions
section.
Reverse Causal Effects
The mediator may be caused by the outcome
variable (Y would cause M in the above diagram), what is commonly called a feedback model. When the initial
variable is a manipulated variable, it cannot be caused by either the mediator
or the outcome. But because both the mediator and the outcome variables
are not manipulated variables, they may cause each other. Often it is advisable to interchange the
mediator and the outcome variable and have the outcome "cause" the
mediator. If the results look similar to the specified mediational pattern
(i.e., the c' and b are about the same in the two models), one would be less
confident in the specified model.
However, it should be realized that the direction of causation between M
and Y cannot be determined by statistical analyses. Sometimes reverse causal effects can be
ruled out theoretically. That is, a causal effect in one direction does
not make sense. Design considerations may also weaken
the plausibility of reverse causation. Ideally, the mediator should be measured
temporally before the outcome variable. If it can be assumed that c' is zero, then reverse causal effects
can be estimated. That is, if it can be assumed that there is complete
mediation (X does not directly cause Y and so c’ is zero), the mediator may cause the outcome and the outcome
may cause the mediator and the model can be estimated using instrumental
variable estimation. Smith (1982) has developed another
method for the estimation of reverse causal effects. Both the mediator
and the outcome variables are treated as outcome variables, and they each may
mediate the effect of the other. To be able to employ the Smith approach,
for both the mediator and the outcome, there must be a different variable that
is known to cause each of them but not the other. So a variable must be
found that is known to cause the mediator but not the outcome and another
variable that is known to cause the outcome but not the mediator. These
variables are called instrumental variables. For
such a model, mediation can be estimated and tested with feedback.
Measurement Error in the Mediator
If the mediator is measured with less
than perfect reliability, then the effects (b and c')
are likely biased. The effect of the mediator on the outcome (path b) is likely
underestimated and the effect of the initial variable on the outcome (path c') is likely over-estimated if ab is positive (which is typical). The
over-estimation of c' is exacerbated
to the extent to which path a is
large. In a parallel fashion, if X is measured with less than perfect
reliability, then the effects (b and c') are likely biased. The effect of the
mediator on the outcome (path b)
is likely over-estimated and the effect of the initial variable on the outcome
(path c') is likely
under-estimated. Moreover, measurement
error in X attenuates the estimates of paths a
and c. Measurement error in Y does not bias
unstandardized estimates, but it does bias standardized estimates, attenuating
them. To remove the biasing effect of
measurement error, multiple indicators of the variable can be used to tap a
latent variable. Alternatively for M, instrumental
variable estimation can be used, but as before, it must be assumed that c' is zero. It is also possible to
fix the error variance at one minus the reliability, times
the variance of the measure. If none of these approaches is used, the
researcher needs to demonstrate that the reliability of the mediator is very
high so that the bias is fairly minimal.
Omitted Variables
In this case, there is a variable that
causes both variables in the equation. For example, at Step 3, there is a
variable that causes both the mediator and the outcome. This is the most
difficult specification error to solve and unfortunately this key assumption is
not directly discussed in Baron and Kenny (1986). (It is discussed in
Judd and Kenny (1981).) Although there has been some work on the omitted
variable problem, the only complete solution is to specify and measure such
variables and control for their effects. Note that if the
initial variable, X, is randomized, then omitted variables do not bias the
estimates of a and c. However, in this case, paths b and c' are biased if there is an omitted
variable that causes M and Y. Assuming
that this omitted variable has paths in the same direction on M and Y and that
ab is positive, then path b is over-estimated and path c' is underestimated. In
this case, if the true c' was zero,
then it would appear that there was inconsistent mediation
when in fact there is complete mediation.
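The direction of this bias can be illustrated with a small simulation. The sketch below assumes Python with NumPy; the population values (a = 0.5, b = 0.4, c' = 0, and an omitted variable U with paths of 0.5 to both M and Y) are hypothetical, and the helper name ols is made up for this example.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slopes...] of y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 100_000                       # large sample so estimates sit near their limits
a, b, c_prime = 0.5, 0.4, 0.0     # hypothetical true paths: complete mediation

x = rng.normal(size=n)            # randomized initial variable
u = rng.normal(size=n)            # omitted variable causing both M and Y
m = a * x + 0.5 * u + rng.normal(size=n)
y = c_prime * x + b * m + 0.5 * u + rng.normal(size=n)

a_hat = ols(m, x)[1]                          # not biased: X is randomized
_, c_prime_hat, b_hat = ols(y, x, m)          # U omitted from the equation
_, c_prime_adj, b_adj, _ = ols(y, x, m, u)    # U measured and controlled

print(f"a = {a_hat:.3f}")
print(f"U omitted:    b = {b_hat:.3f}, c' = {c_prime_hat:.3f}")
print(f"U controlled: b = {b_adj:.3f}, c' = {c_prime_adj:.3f}")
```

With U omitted, the estimate of b is too large and the estimate of c' is negative even though the true c' is zero, which is the appearance of inconsistent mediation described above. Controlling for U recovers the generating values.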
Sometimes the source of correlation
between the mediator and the outcome is a common method effect. For instance,
the measuring scale of the two variables is the same. Ideally, efforts
should be made to ensure that the two variables do not share method effects
(e.g., both are self-reports from the same person). A latent
variable analysis might be used to remove the effects of correlated
measurement error.
The Mediator as also a Moderator
Baron
and Kenny (1986) and Kraemer et al. (2002) discuss the possibility that M might
interact with X to cause Y. Baron and
Kenny (1986) refer to this as M being both a mediator and a moderator and
Kraemer et al. (2002) as a form of mediation. The X with
M interaction should be estimated and tested and added to the model if present.
One of the best
ways to increase the internal validity of mediational analysis is by the design
of the study. Key considerations are
randomizing X (i.e., randomly assigning units to levels of X), the timing of
measurement of M and Y, and obtaining prior values of M and Y. By randomizing X, it is known that both M and
Y do not cause X. By measuring M after X,
and Y after M, it is known that M does not cause X and that Y does not cause X
or M. Finally, by obtaining prior
measures of M and Y and controlling for them, we can reduce and perhaps eliminate
the effects of omitted variables. The
reader should consult Cole and Maxwell (2003) about the difficulties of
estimating mediational effects using a cross-sectional design. Also as mentioned earlier, it is possible to
randomize X, M, and Y (Smith, 1982).
Rarely in
mediation are there just the three variables of X, M, and Y. Discussed in this section is how to handle
additional variables in a mediational model.
Multiple Mediators
If there are
multiple mediators, they can be tested simultaneously or separately. The
advantage of doing them simultaneously is that one learns if the mediation is
independent of the effect of the other mediators. One should make sure
that the different mediators are conceptually distinct and not too highly
correlated. (Kenny et al. (1998) consider an example with two mediators.)
There is an interesting case of two mediators (see below) in which the two indirect effects (the ab products) are opposite in sign. The sum of
indirect effects for M1 and M2 could then be zero. It might then be possible
that c is near zero, because there
are two indirect effects that work in the opposite direction. In this
case "no effect" would be mediated.
The Hayes and
Preacher bootstrapping macro can be used to test hypotheses about the linear
combinations of indirect effects: For example, are they equal? Do they sum to zero?
Multiple Outcomes
If there are multiple outcomes, they can
be tested simultaneously or separately. If tested simultaneously, the
entire model can be estimated by structural equation modeling. One might want to consider combining the
multiple outcomes into one or more latent variables.
Multiple Initial Variables
In this case there are multiple X
variables and each has an indirect effect on Y.
The Hayes and Preacher bootstrapping macro can be used to test
hypotheses about the linear combinations of indirect effects: For example, are
they equal? Do they sum to zero? One can alternatively treat the multiple X
variables as a formative variable
so that a single “super variable” can be used to summarize the indirect
effect. As seen below, the formative
variable X “mediates” the effect of X on M and Y. The model can be tested and it has k - 1 degrees of freedom where k is the number of X variables. Thus, the degrees of freedom for the example
would be 1.
Covariates
There are often variables that do not
change that can cause or be correlated with the initial variable, mediator, and
outcome (e.g., age, gender, and ethnicity); these variables are commonly called
covariates. They would generally be included in each equation and
would not be trimmed from equations unless they are dropped from all of the
equations. If these variables interact with X or M, they would be called
moderator variables.
Mediated Moderation and Moderated Mediation
Moderation means that the effect of a
variable on an outcome is altered (i.e., moderated) by a covariate. Moderation
is usually captured by an interaction between the initial variable and the
covariate. If this moderation is mediated, then we have the usual pattern
of mediation but the X variable is an interaction and the pattern would be
referred to as mediated moderation. All the Baron and Kenny
steps would be repeated with the causal variable or X being an interaction, and
the two main effects would be treated as "covariates." We could compute the total effect or the
original moderation effect, the direct effect or how much moderation exists
after introducing the mediator, and the indirect effect or how much of the
total moderation effect is due to the mediator.
Sometimes, mediation can be stronger for
one group (e.g., males) than for another (e.g., females), something called moderated mediation. There are two major
different forms of moderated mediation. The effect of the initial
variable on the mediator may differ as a function of the moderator (i.e., path a varies) or the mediator may interact
with the moderator to cause the outcome (i.e., path b varies). It is also
possible that the direct effect or c’ might
change as a function of the moderator. Papers by Muller, Judd, and Yzerbyt
(2005) and Edwards and Lambert (2007) discuss mediated moderation and moderated
mediation and examples of each. Also Preacher, Rucker, and Hayes have
developed a macro for estimating moderated mediation.
Some or all of the mediational variables
might be latent variables. Estimation
would be accomplished using a structural equation modeling (SEM) program (e.g.,
LISREL, Amos, EQS, or Mplus). Some programs provide measures and tests of
indirect effects. Also such programs are quite flexible in handling
multiple mediators and outcomes. The one complication is how to handle
Step 1. That is, if two models are estimated, one with the mediator and
one without, the paths c and c’ are not comparable because the
factor loadings would be different. It is then inadvisable to test the
relative fit of two structural models, one with the mediator and one
without. Rather c, the total
effect, can be estimated using the formula of c' + ab. Most SEM programs give this estimate. If
there are multiple mediators, Amos does not compute indirect effects for each
mediator. The reader should consult Macho
and Ledermann (2011) for a method that does decompose the total indirect effect
into separate effects. One
advantage of a latent variable model is that correlated measurement error in X,
M, and Y might be modeled. For instance,
instead of using self-report for all the variables, different methods might be
used.
Dichotomous Variables
In this case either the mediator or the
outcome is a dichotomy. Having the initial variable be a dichotomy is not
problematic. In this case the analysis would likely be conducted using
logistic regression when the criterion measure is dichotomous. One can
still use the Baron and Kenny steps and the Sobel test. The one
complication is the computation of the indirect effect and the degree of mediation
because the coefficients need to be transformed. With dichotomous outcomes, it is advisable to
use a program like Mplus that can handle such variables.
Clustered Data
Traditional mediation analyses presume
that the data are at just one level.
However, sometimes the data are clustered in that persons are in
classrooms or groups, or the same person is measured over time. With clustered data, multilevel modeling
should be used. Estimation of mediation
within multilevel models can be very complicated, especially when the mediation
occurs at level one and when that mediation is allowed to be random, i.e., vary
across level two units. The reader is referred to Krull and MacKinnon
(1999), Kenny, Korchmaros, and Bolger (2003), and Bauer, Preacher, and Gil (2006) for
a discussion of this topic. Recently, Preacher, Zyphur, and Zhang (2010) have proposed that multilevel structural
equation methods or MSEM can be used to estimate these models. Ledermann, Macho, and Kenny (2011) discuss
mediational models for dyadic data.
Un-discussed Topics
Topics not discussed here include non-linear mediation (Imai, Keele, & Tingley, 2010).
References
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.
Bauer, D. J., Preacher, K. J., & Gil, K. M. (2006). Conceptualizing and testing random indirect effects and moderated mediation in multilevel models: New procedures and recommendations. Psychological Methods, 11, 142-163.
Bollen, K. A., & Stine, R. (1990). Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology, 20, 115-140.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (rev. ed.).
Cole, D. A., & Maxwell, S. E. (2003). Testing mediational models with longitudinal data: Questions and tips in the use of structural equation modeling. Journal of Abnormal Psychology, 112, 558-577.
Edwards, J. R., & Lambert, L. S. (2007). Methods for integrating moderation and mediation: A general analytical framework using moderated path analysis. Psychological Methods, 12, 1-22.
Frazier, P. A., Tix, A. P., & Barron, K. E. (2004). Testing moderator and mediator effects in counseling psychology research. Journal of Counseling Psychology, 51, 115-134.
Fritz, M. S., & MacKinnon, D. P. (2007). Required sample size to detect the mediated effect. Psychological Science, 18, 233-239.
Fritz, M. S., Taylor, A. B., & MacKinnon, D. P. (2012). Explanation of two anomalous results in statistical mediation analysis. Multivariate Behavioral Research, in press.
Hoyle, R. H., & Kenny, D. A. (1999). Statistical power and tests of mediation. In R. H. Hoyle (Ed.), Statistical strategies for small sample research.
Hyman, H. H. (1955). Survey design and analysis.
Imai, K., Keele, L., & Tingley, D. (2010). A general approach to causal mediation analysis. Psychological Methods, 15, 309-334.
James, L. R., & Brett, J. M. (1984). Mediators, moderators and tests for mediation. Journal of Applied Psychology, 69, 307-321.
Judd, C. M., & Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review, 5, 602-619.
Judd, C. M., & Kenny, D. A. (2010). Data analysis. In D. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The handbook of social psychology (5th ed., Vol. 1, pp. 115-139).
Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. Gilbert, S. Fiske, & G. Lindzey (Eds.), The handbook of social psychology (4th ed., Vol. 1, pp. 233-265).
Kenny, D. A., Korchmaros, J. D., & Bolger, N. (2003). Lower level mediation in multilevel models. Psychological Methods, 8, 115-128.
Kraemer, H. C., Wilson, G. T., Fairburn, C. G., & Agras, W. S. (2002). Mediators and moderators of treatment effects in randomized clinical trials. Archives of General Psychiatry, 59, 877-883.
Krull, J. L., & MacKinnon, D. P. (1999). Multilevel mediation modeling in group-based intervention studies. Evaluation Review, 23, 418-444.
Ledermann, T., Macho, S., & Kenny, D. A. (2011). Assessing mediation in dyadic data using the Actor-Partner Interdependence Model. Structural Equation Modeling, in press.
MacCorquodale, K., & Meehl, P. E. (1948). On a distinction between hypothetical constructs and intervening variables. Psychological Review, 55, 95-107.
Macho, S., & Ledermann, T. (2011). Estimating, testing, and comparing specific effects in structural equation models: The phantom model approach. Psychological Methods, 16, 34-43.
MacKinnon, D. P., Fairchild, A. J., & Fritz, M. S. (2007). Mediation analysis. Annual Review of Psychology, 58, 593-614.
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test the significance of the mediated effect. Psychological Methods, 7, 83-104.
MacKinnon, D. P., Warsi, G., & Dwyer, J. H. (1995). A simulation study of mediated effect measures. Multivariate Behavioral Research, 30, 41-62.
Muller, D., Judd, C. M., & Yzerbyt, V. Y. (2005). When moderation is mediated and mediation is moderated. Journal of Personality and Social Psychology, 89, 852-863.
Preacher, K. J., & Kelley, K. (2011). Effect size measures for mediation models: Quantitative strategies for communicating indirect effects. Psychological Methods, 16, 93-115.
Preacher, K. J., Zyphur, M. J., & Zhang, Z. (2010). A general multilevel SEM framework for assessing multilevel mediation. Psychological Methods, 15, 209-233.
Shrout, P. E., & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7, 422-445.
Smith, E. (1982). Beliefs, attributions, and evaluations: Nonhierarchical models of mediation in social cognition. Journal of Personality and Social Psychology, 43, 248-259.
Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt (Ed.), Sociological Methodology 1982 (pp. 290-312).

Links to Other Sites
Do a mediation analysis and output a text description of the results.
Find out why computing partial correlations to test mediation is problematic.
Go to my moderation page.
Read a paper I have written called "Reflections on Mediation."
Dave MacKinnon’s mediation website.
Kris Preacher's papers and programs.
Andrew Hayes's papers and programs.
Mediation Facebook pages.
View my PowerPoint presentation on mediation.
Please suggest new links!