David A. Kenny
November 28, 2009
Multiple Group Models
Basic Question
To what extent is the causal model the same in two or more independent groups? For instance, is a causal model the same for
men and women? Note
that the groups must be independent; if, for instance, both husbands and wives
are measured in the same sample, they must be analyzed
in one analysis. The same holds if a causal
model is estimated for the same people at two different times. Note also that the model differs across
groups of persons and does not vary as a function of a quantitative variable. Finally, the “group variable” must be
measured. If it is unmeasured, we have “latent
class analysis,” a topic not discussed on this page.
Data Preparation
Normally, the raw data are inputted. If the covariance matrix is to be
read, usually it is computationally more efficient to input the correlation
matrix with the set of standard deviations and means. It is almost always
wrong to estimate a multiple group model analyzing the correlation matrices
because groups usually differ in their variances.
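As a minimal sketch of why the standard deviations must accompany a correlation matrix, the following rebuilds each group's covariance matrix from a correlation matrix and a set of standard deviations. The function name and the two-variable numbers are hypothetical and chosen only for illustration.

```python
def covariance_from_correlation(corr, sds):
    """Rebuild a covariance matrix: cov[i][j] = corr[i][j] * sd[i] * sd[j]."""
    n = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(n)] for i in range(n)]

# Hypothetical example: both groups share the same correlation matrix,
# but the second group has standard deviations twice as large.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
cov_group_1 = covariance_from_correlation(corr, [1.0, 2.0])
cov_group_2 = covariance_from_correlation(corr, [2.0, 4.0])

print(cov_group_1)
print(cov_group_2)
```

The two covariance matrices differ even though the correlations are identical, which is the information lost when only correlation matrices are analyzed.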
Basic Strategy
Normally we begin with the same model in each group, but the parameters (e.g., paths and
loadings) are allowed to differ across the groups. We
then impose equality constraints to test invariance. We keep imposing more constraints until we
obtain a poor fitting model. If we think
it is likely that the groups are not different, i.e., there is complete
invariance, we could start the other way. We begin with a model of complete invariance and then
allow for some differences between groups.
Because there are many parameters that might be invariant (e.g., 29 in the example below), they need to be tested sequentially and in groups. The exact sequence and grouping can vary.
Model I: Find a Common Model
Before beginning to estimate invariance models, it
must be established that a model without any invariances (i.e., the same model
in all groups, but with parameters free to vary, called the configural model) is a
reasonable model. The chi square and degrees of freedom of this model equal the sums of the chi squares and
degrees of freedom across groups, and that fit reveals the extent to
which the underlying structure fits the data when no constraints across groups
are added. Before we can decide that parameter estimates are the same, we
must be sure that the model we are estimating is reasonable. Once this is
done, that model can be used as a basis for comparison to test for invariance.
In comparing models with large sample sizes, one often should use a measure of
fit like the CFI or RMSEA and not the chi square difference.
Ideally, one searches for the common model using both groups. It is probably inadvisable to use the entire sample as if it were a single group because such a strategy uses a mixture of the groups and would be biased toward the model that favors the larger of the two groups.
Model II: Invariance of Factor Loadings
The first set of parameters to test for
invariance is always the factor loadings. If the
factor loadings are not invariant, then it makes no sense to test the equality
of the paths because the units of measurement would differ across groups.
So if the loadings do not vary across groups, proceed to Model III.
If the loadings are different, the results
very much depend on the choice of the marker variable. Consider for instance a case
with four indicators in two groups in which the loadings of the 2nd
through the 4th indicators are invariant but that of the 1st is not. If the 1st
indicator is used as the marker, it will appear that the other three loadings
are changing across groups when in fact they are invariant. It can be advisable
to change the marker variable to determine which loadings are invariant and
which are not. One may find that some of the loadings are invariant and others
are not; if one has an excess of indicators, one can drop from the model those
indicators whose loadings differ.
Model III: Invariance of Paths
The second set of invariances tested is the invariance of
the causal paths. Again, this test should be executed only if the loadings
are invariant. Normally the invariance of
the individual paths would also be tested. That
is, a model would be estimated that has all but one set of
paths the same in the groups.
Remaining Tests
There is a strong consensus in the literature that the first three models are tested
in the order that has been given above. There is not much consensus about the order of
tests of the remaining parameters.
One view is that the next set of tests can be in almost any order, although tests
of covariances should be done only if the variances are invariant (in which case they are also
tests of the equality of correlations). Note that if a set of parameters is not
invariant, then that test might be moved to the bottom of the list and redone. This is
done because tests of invariance presume that the parameters tested above are
invariant.
Model IV: Invariance of Error Variances
Regardless of what has happened above, it is
meaningful to test whether the error variances are the same in both
groups. If the paths vary or if both the loadings and paths vary, such
variation should be allowed for in this model.
Model V: Invariance of Error Covariances
If the error variances are invariant, we can test whether
the error covariances are equal. In essence, this tests the equality of
the error correlations.
Model VI: Invariance of Factor and Endogenous Disturbance Variances
The next test is whether the factor variances are
equal. This test is meaningful only if the loadings are invariant.
If the paths or the error variances vary, that variation can
still be allowed for in this model. We may wish to
test the factor variances and the disturbance variances separately.
Model VII: Invariance of Factor and Disturbance Covariances
The next test is whether the factor covariances are
equal. This test is meaningful only if the loadings and the factor variances
are invariant. Given equality of the factor variances, this test
evaluates equality of the factor correlations.
Model VIII: Invariance of Relative Intercepts
The next set of invariances tested might be the
intercepts of the indicators. For latent
variables with two or more indicators, the indicator intercepts are set equal
across groups. However, the factor means
or intercepts are allowed to vary across groups. This
is done by fixing the factor mean or intercept to zero in one group and
freeing it in the other groups. In
essence, what is being tested here is a group by indicator interaction.
Model IX: Invariance of Factor Intercepts and Means
The next set of invariances tested is whether the means of the
exogenous factors and the intercepts of the endogenous factors are the same in the
different groups. In essence, what is
being tested here is the main effect of group.
Note that if this last model has a good fit,
then the groups can be combined into a single sample and group can be ignored.
Interpretation
If a parameter set is deemed to vary across
groups, to interpret those differences examine the estimates of a previous
model in which that parameter set varies.
Neff Example
This example is taken from
Neff, J. A. (1985). Race and vulnerability to stress: An examination of differential vulnerability. Journal of Personality and Social Psychology, 49, 481-491.
The same model is estimated for 658 Whites and 171 Blacks. The following variables are in the model, using Neff's notation:
There appears to be an error in the standard deviation for education of whites. It is changed to .75.
The measurement model is as follows: The first two variables are indicators of a life change or stress factor. The next three are indicators of a mental health factor which indicates poor mental health. The next two are indicators of socio-economic status or SES and the last is a single indicator variable of age.
The structural model is as follows: Age and SES are exogenous and they each cause the endogenous factors. Stress is assumed to cause mental health. The model is presented in Figure 1 of the paper and below:

This model has 29 parameters in each group (4 loadings, 5 paths, 7 intercepts, 1 mean, 7 error variances, 2 exogenous variances, 2 disturbance variances, and 1 covariance) and 15 degrees of freedom in each group. The results from the nine models described previously are as follows (Model V is not estimated because there are no error covariances):
Model | Tested | Chi Square | df | RMSEA | TLI
I | ----- | 69.873 | 30 | .040 | .940
II | loadings | 80.602 | 34 | .041 | .938
III | paths | 86.162 | 39 | .038 | .945
IV | error variances | 139.223 | 46 | .050 | .909
VI | variance & dist. | 187.136 | 50 | .058 | .876
VII | exog. covariance | 197.723 | 51 | .059 | .870
VIII | relative intercepts | 218.113 | 55 | .060 | .866
IX | factor means & int. | 391.181 | 59 | .083 | .746
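The degrees of freedom and the chi square differences implied by the table above can be checked with a short sketch. It assumes 8 observed variables per group (seven indicators plus Age), so that each group supplies 36 variances and covariances plus 8 means. The helper chi2_sf is not from any SEM package; it is a hypothetical name for a closed-form series for the upper tail of a chi square with integer degrees of freedom. As noted under Model I, with large samples the chi square difference is often not the preferred criterion, so the p values are only one piece of evidence.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi square with integer df (finite series)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(chi/sqrt(2)) + 2*Z(chi) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    z = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * z * total

# Degrees of freedom of Model I (assumes 8 observed variables per group).
n_observed = 8
moments_per_group = n_observed * (n_observed + 1) // 2 + n_observed  # 36 + 8
params_per_group = 4 + 5 + 7 + 1 + 7 + 2 + 2 + 1                     # 29
df_per_group = moments_per_group - params_per_group
df_model_1 = 2 * df_per_group  # two groups

# (model, chi square, df) copied from the table above.
models = [("I", 69.873, 30), ("II", 80.602, 34), ("III", 86.162, 39),
          ("IV", 139.223, 46), ("VI", 187.136, 50), ("VII", 197.723, 51),
          ("VIII", 218.113, 55), ("IX", 391.181, 59)]

# Compare each model with the one immediately above it in the table.
differences = []
for (prev, chi_prev, df_prev), (cur, chi_cur, df_cur) in zip(models, models[1:]):
    d_chi, d_df = chi_cur - chi_prev, df_cur - df_prev
    differences.append((f"{cur} vs {prev}", d_chi, d_df, chi2_sf(d_chi, d_df)))

print("df per group:", df_per_group, "Model I df:", df_model_1)
for label, d_chi, d_df, p in differences:
    print(f"{label}: chi square diff = {d_chi:.3f}, df = {d_df}, p = {p:.4f}")
```

Each step in the table adds exactly as many degrees of freedom as the number of parameters constrained (for example, 4 loadings for Model II and 5 paths for Model III).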
Model I: Although the chi square for this model is statistically significant, the TLI and RMSEA are acceptable. Thus, the model is a reasonably good fitting model.
Model II: For the Neff study, it appears that the loadings are
invariant. We see a slight decline in the TLI and a slight increase in the
RMSEA.
Variable  Whites  Blacks  Summary
X1        1.000   1.000   Education more important for Blacks
X2        0.209   0.491
Y1        1.000   1.000   Total change more important for Whites
Y2        0.657   0.809
Y3        1.000   1.000   Nervous more important for Whites
Y4        0.988   1.185
Y5        1.002   1.229
Note that in comparing loadings, their relative size needs to be compared. So if Y4 or Y5 is made the marker, it would be seen more clearly that Y3 is the more variable indicator. With Y5 as the marker:
Variable  Whites  Blacks
Y3        0.998   0.814
Y4        0.986   0.964
Y5        1.000   1.000
That is, Y3 is considerably lower for Blacks than for Whites.
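The change of marker can be reproduced by dividing each group's loadings by the loading of the new marker. The sketch below uses the mental health loadings reported above with Y3 as the marker; the function name is hypothetical, and the sketch assumes that rescaling the estimates is equivalent to re-estimating the unconstrained model with Y5 as the marker.

```python
def rescale_to_marker(loadings, marker):
    """Divide every loading by the loading of the new marker indicator."""
    return {name: value / loadings[marker] for name, value in loadings.items()}

# Mental health loadings from the table above (Y3 is the original marker).
whites = {"Y3": 1.000, "Y4": 0.988, "Y5": 1.002}
blacks = {"Y3": 1.000, "Y4": 1.185, "Y5": 1.229}

whites_y5 = rescale_to_marker(whites, "Y5")
blacks_y5 = rescale_to_marker(blacks, "Y5")

for name in whites:
    print(name, round(whites_y5[name], 3), round(blacks_y5[name], 3))
```

Rounded to three decimals, the rescaled values agree with the Y5-marker loadings shown above.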
Model III: Equal Paths
Cause   Effect          Whites   Blacks   Summary
SES     Stress           0.057   -0.009   SES affects Stress more for Whites
SES     Mental Health   -0.081   -0.097
Age     Stress           0.004    0.006
Age     Mental Health   -0.009   -0.007
Stress  Mental Health    0.127    0.195   Blacks more affected by stress
Very often the equality of the paths is of central interest. The paths can be tested individually by examining the modification indices from Model III. (The square root of a modification index can be treated as an approximate Z test.) Each index evaluates making that path the only one to be unequal across groups. Below we see that none of the race differences in the paths reaches conventional statistical significance:
Equality of the Individual Paths
Cause   Effect          Z*
SES     Stress           1.78
SES     Mental Health    1.19
Age     Stress          -1.52
Age     Mental Health   -1.30
Stress  Mental Health   -1.22
*White path minus the Black path
There is a marginally significant difference: higher SES leads to greater Stress more strongly for Whites than for Blacks.
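To make "marginally significant" concrete, the Z values above can be converted to two-sided p values under a standard normal approximation. This is a minimal sketch; the function name is hypothetical, and the implied modification index is simply the square of Z.

```python
import math

def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Z tests for the individual paths (White path minus Black path).
z_tests = {
    ("SES", "Stress"): 1.78,
    ("SES", "Mental Health"): 1.19,
    ("Age", "Stress"): -1.52,
    ("Age", "Mental Health"): -1.30,
    ("Stress", "Mental Health"): -1.22,
}

p_values = {path: two_sided_p(z) for path, z in z_tests.items()}

for (cause, effect), z in z_tests.items():
    p = p_values[(cause, effect)]
    print(f"{cause} -> {effect}: Z = {z:5.2f}, Z^2 = {z * z:.2f}, p = {p:.3f}")
```

Only the SES to Stress path has a p value below .10, and none falls below .05.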
Model IV: Equal Error Variances
When we force the error variances to be equal, we find that the fit worsens but only slightly.
Variable  Whites  Blacks  Summary
X1        5.190   3.524   Whites more variable
X2        0.405   0.266
Y1        0.107   0.002   Whites more variable
Y2        0.237   0.185
Y3        0.161   0.237   Blacks more variable
Y4        0.106   0.196
Y5        0.123   0.180
We see that there is more error variance for the SES and Stress indicators for Whites, but there is more error variance for Blacks for the mental health indicators.
Model VI: Equal Variances of Exogenous Variables and Endogenous Disturbances
Variable            Whites    Blacks    Summary
SES                 2.886     1.147     Whites more variable on everything but Mental Health
Age                 330.876   278.556
Stress (U)          0.784     0.402
Mental-Health (V)   0.074     0.154
Note that presuming that the variances in the two groups are equal results in a worsening of fit.
Model VII: Covariance/Correlation of Exogenous Variables and Disturbances
Testing for the equality of the covariances makes little sense if the variances are not equal, but it is done here for illustrative purposes only. We also note that the result reverses if we allow for mean and intercept differences between the two groups.
Variables  Whites         Blacks         Summary
SES-Age    -11.822/-.42   -21.173/-.76   r more negative for Blacks
Despite the large difference between the correlations, the difference is not statistically significant.
Model VIII: Equal Relative Intercepts
Variable  Whites  Blacks  Summary
X1        5.980   3.570   X1 relatively higher for Whites
X2        2.190   1.660
Y1        0.538   0.223   Y1 relatively higher for Whites
Y2        0.181   0.101
Y3        0.616   0.663   Y4 relatively higher for Blacks
Y4        0.637   0.864
Y5        0.638   0.754
What matters here is the difference between indicators, not their absolute size.
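One rough way to look at relative rather than absolute size is to take the White minus Black gap for each indicator and then center those gaps within each factor. This is only a descriptive sketch: it ignores differences in the loadings, and the grouping of indicators into factors (X1 and X2 for SES, Y1 and Y2 for Stress, Y3 through Y5 for Mental Health) is an assumption read off the tables above.

```python
# (Whites, Blacks) intercepts from the table above, grouped by assumed factor.
intercepts = {
    "SES": {"X1": (5.980, 3.570), "X2": (2.190, 1.660)},
    "Stress": {"Y1": (0.538, 0.223), "Y2": (0.181, 0.101)},
    "Mental Health": {"Y3": (0.616, 0.663), "Y4": (0.637, 0.864),
                      "Y5": (0.638, 0.754)},
}

relative_gaps = {}
for factor, indicators in intercepts.items():
    # White minus Black gap for each indicator of this factor.
    gaps = {name: w - b for name, (w, b) in indicators.items()}
    mean_gap = sum(gaps.values()) / len(gaps)
    # Positive: relatively higher for Whites; negative: relatively higher for Blacks.
    relative_gaps[factor] = {name: gap - mean_gap for name, gap in gaps.items()}

for factor, gaps in relative_gaps.items():
    for name, value in gaps.items():
        print(f"{factor:14s} {name}: {value:+.3f}")
```

Under these assumptions, the signs agree with the summary column: X1 and Y1 are relatively higher for Whites, and Y4 has the most negative centered gap among the mental health indicators.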
Model IX: Equal Exogenous Factor Means and Endogenous Factor Intercepts
Variable        Whites   Blacks   Summary
SES             0.000    -2.370   Whites higher SES
Age (X3)        46.310   42.176   Whites older
Stress          0.000    -0.152   Whites more stress
Mental Health   0.000    -0.012   Whites worse mental health
Because Age is a single indicator and exogenous, it has a mean for both groups. Because SES, Stress, and Mental Health are latent variables and we have already constrained their relative intercepts to be equal, we set their factor means and intercepts to zero in one group (Whites) and free them in the other group(s) (Blacks).
We note that although the raw means show that Blacks have poorer mental health than Whites, once Age, SES, and Stress are controlled, Blacks have slightly better mental health. The differences between Blacks and Whites on the two endogenous factors (Stress and Mental Health) are not statistically significant.
Summary
Note that because Model IX is not a good fitting model, we cannot pool the data of Whites and Blacks. Likely, the best fitting model is Model III, the model with equal loadings and paths. Perhaps we might wish to additionally allow for equal error variances.