Fit refers to the ability of a model to reproduce the data (usually the variance-covariance matrix). A good-fitting model is not necessarily a valid model: for instance, a model all of whose parameters are zero can still be a "good-fitting" model. There are now literally hundreds of measures of fit. This page includes some of the major ones but does not pretend to include them all. Though a bit dated, the book edited by Bollen and Long (Testing structural equation models. Newbury Park, CA: Sage, 1993) explains these indexes and others.
Chi Square: χ²
For models with about 75 to 200 cases, this is a reasonable measure of fit. But for models with more cases, the chi square is almost always statistically significant. Chi square is also affected by the size of the correlations in the model: the larger the correlations, the poorer the fit. For these reasons alternative measures of fit have been developed. (Web calculators are available for computing p values for chi square.)
Chi Square to df Ratio: χ²/df
There are no consistent standards for what value of this ratio indicates an acceptable model.
Transforming Chi Square to Z
Sometimes chi square is more interpretable if it is transformed into a Z value. The following approximation can be used:
Z = √(2χ²) - √(2df - 1)
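A minimal Python sketch of this transformation, assuming the approximation meant is the widely used one, Z = √(2χ²) - √(2df - 1). The function name is illustrative only.

```python
import math


def chi_square_to_z(chi2: float, df: int) -> float:
    """Approximate Z value for a chi square with df degrees of freedom.

    Assumes the approximation Z = sqrt(2 * chi2) - sqrt(2 * df - 1).
    """
    if chi2 < 0 or df < 1:
        raise ValueError("chi2 must be non-negative and df must be at least 1")
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * df - 1.0)


# Example: a chi square of 50 with 30 degrees of freedom.
z = chi_square_to_z(50.0, 30)
print(round(z, 2))
```

A chi square close to its degrees of freedom gives a Z near zero, and larger values of Z indicate poorer fit.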
Bentler-Bonett Index or Normed Fit Index (NFI)
Define the null model as a model in which all of the correlations or covariances are zero. The null model is referred to as the "Independence Model" in AMOS. The formula for the index is:
[χ²(Null Model) - χ²(Proposed Model)] / χ²(Null Model)
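As a small illustration, the index can be computed from the two chi square values. This sketch assumes the standard definition of the NFI, the proportional drop in chi square from the null model to the proposed model; the function name is hypothetical.

```python
def normed_fit_index(chi2_null: float, chi2_model: float) -> float:
    """Bentler-Bonett NFI: proportional reduction in chi square
    relative to the null (independence) model."""
    if chi2_null <= 0:
        raise ValueError("the null-model chi square must be positive")
    return (chi2_null - chi2_model) / chi2_null


# Example with made-up values: null chi square 500, proposed chi square 50.
print(normed_fit_index(500.0, 50.0))
```

Because the degrees of freedom never enter the calculation, freeing more parameters can only raise the value.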
Tucker Lewis Index or Non-normed Fit Index (NNFI)
A problem with the Bentler-Bonett index is that there is no penalty for adding parameters. The Tucker-Lewis index does have such a penalty. Let χ²/df be the ratio of chi square to its degrees of freedom. The index is computed as:
[χ²/df(Null Model) - χ²/df(Proposed Model)] / [χ²/df(Null Model) - 1]
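The penalty can be seen in a short Python sketch. It assumes the usual form of the index, in which the chi square to df ratios of the null and proposed models are compared and the expected ratio of 1 is subtracted in the denominator; the function name is illustrative.

```python
def tucker_lewis_index(chi2_null: float, df_null: int,
                       chi2_model: float, df_model: int) -> float:
    """TLI / NNFI based on chi-square-to-df ratios.

    The value is not forced into the 0-1 range, which is why the index
    is called "non-normed".
    """
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    if ratio_null == 1.0:
        raise ValueError("index is undefined when the null ratio equals 1")
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Same chi square, fewer degrees of freedom (more parameters): lower TLI.
print(tucker_lewis_index(500.0, 10, 12.0, 6))
print(tucker_lewis_index(500.0, 10, 12.0, 4))
```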
Comparative Fit Index (CFI)
This measure is directly based on the non-centrality measure. Let d = χ² - df where df are the degrees of freedom of the model. The Comparative Fit Index equals
[d(Null Model) - d(Proposed Model)] / d(Null Model)
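A minimal sketch of the calculation, with d computed as chi square minus df for each model. Restricting the result to the range 0 to 1 is a common convention and is an assumption here, as is the function name.

```python
def comparative_fit_index(chi2_null: float, df_null: int,
                          chi2_model: float, df_model: int) -> float:
    """CFI from the non-centrality estimates d = chi2 - df."""
    d_null = chi2_null - df_null
    d_model = chi2_model - df_model
    if d_null <= 0:
        raise ValueError("d for the null model must be positive")
    cfi = (d_null - d_model) / d_null
    # Assumed convention: keep the index between 0 and 1.
    return min(1.0, max(0.0, cfi))


# Example with made-up values.
print(round(comparative_fit_index(500.0, 10, 12.0, 6), 3))
```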
Root Mean Square Error of Approximation (RMSEA)
This measure is based on the non-centrality parameter. Its formula can be shown to equal:
√[([χ²/df] - 1)/(N - 1)]
where N is the sample size and df the degrees of freedom of the model. (If χ² is less than df, then RMSEA is set to zero.) Good models have an RMSEA of .05 or less. Models whose RMSEA is .10 or more have poor fit.
A confidence interval can be computed for this index. First, the value of the non-centrality parameter is estimated by χ² - df. Confidence limits for the non-centrality parameter can then be determined from χ², df, and the width of the confidence interval. (One can use the function "CNONCT" within SAS to compute these values. Web calculators for the non-centrality parameter are also available.) These limits are then substituted for χ² - df in the formula for the RMSEA. Ideally the lower limit of the 90% confidence interval is at or very near zero and the upper limit is not very large, i.e., less than .08.
Note that the RMSEA can be misleading when the df are small and sample size is not
large. For instance, a chi square of 2.098 (a value not statistically significant),
with a df of 1 and N of 70 yields an RMSEA of .126.
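The small-sample example can be checked with a few lines of Python. The sketch applies the formula given above, including the rule that the RMSEA is zero when chi square is less than df; the function name is illustrative.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA = sqrt(((chi2 / df) - 1) / (N - 1)), set to 0 if chi2 < df."""
    if df < 1 or n < 2:
        raise ValueError("df must be at least 1 and N at least 2")
    if chi2 <= df:
        return 0.0
    return math.sqrt((chi2 / df - 1.0) / (n - 1))


# The example from the text: chi square 2.098, df = 1, N = 70.
print(round(rmsea(2.098, 1, 70), 3))  # 0.126
```

With the same chi square and df but N = 700, the value drops to about .04, which shows how strongly a small N drives the result.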
p of Close Fit (PCLOSE)
The null hypothesis is that the RMSEA is .05, a close-fitting model. The p value examines the alternative hypothesis that the RMSEA is greater than .05. So if the p is greater than .05, then it is concluded that the fit of the model is "close."
Standardized Root Mean Square Residual (SRMR)
This measure is the standardized difference between the
observed covariance and predicted covariance. A value of zero indicates
perfect fit. This measure tends to be smaller as sample size increases
and as the number of parameters in the model increases.
A value less than .08 is considered a good fit.
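The following is a rough sketch of one common way to compute the measure, not necessarily what any particular program does. It assumes each residual is standardized by the observed standard deviations of the two variables and that the squared values are averaged over the unique elements of the matrix, diagonal included. The function name and the example matrices are made up.

```python
import math


def srmr(observed, implied):
    """Root mean square of standardized covariance residuals.

    observed, implied: square covariance matrices as lists of lists.
    """
    k = len(observed)
    total = 0.0
    for i in range(k):
        for j in range(i + 1):  # lower triangle, including the diagonal
            scale = math.sqrt(observed[i][i] * observed[j][j])
            total += ((observed[i][j] - implied[i][j]) / scale) ** 2
    return math.sqrt(total / (k * (k + 1) / 2))


observed = [[1.0, 0.5], [0.5, 1.0]]
implied = [[1.0, 0.4], [0.4, 1.0]]
print(round(srmr(observed, implied), 3))
```

When the implied matrix equals the observed matrix, every residual is zero and the function returns 0, the value described above as perfect fit.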
Akaike Information Criterion (AIC)
The AIC is a comparative measure of fit: it indicates a better fit when it is smaller. The measure is not standardized and is not interpreted for a single model. For two models estimated from the same data set, the model with the smaller AIC is to be preferred. Its formula is:
χ² + k(k + 1) - 2df
where k is the number of variables in the model and df is the degrees of freedom of the model. Note that k(k + 1) - 2df equals twice the number of free parameters in the model, so the AIC makes the researcher pay a penalty of two for every parameter that is estimated.
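A small sketch of the comparison between two models. It takes the number of free parameters to be k(k + 1)/2 - df, that is, the distinct variances and covariances among k variables minus the degrees of freedom; this count is an assumption that holds for models fitted to a covariance matrix without means. The function names and the numbers are illustrative.

```python
def free_parameters(k: int, df: int) -> int:
    """Free parameters, assuming k(k + 1)/2 distinct variances and covariances."""
    return k * (k + 1) // 2 - df


def aic(chi2: float, k: int, df: int) -> float:
    """Chi square plus a penalty of two per estimated parameter."""
    return chi2 + 2 * free_parameters(k, df)


# Two hypothetical models for the same six variables.
aic_a = aic(12.0, 6, 8)   # 13 free parameters
aic_b = aic(10.5, 6, 6)   # 15 free parameters
print(aic_a, aic_b)
```

Here the second model has the smaller chi square, but its two extra parameters cost more than the improvement in fit, so the first model has the smaller AIC.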
Bayesian Information Criterion (BIC) and Adjusted BIC
The AIC pays a penalty of 2 for each parameter estimated. The BIC and the adjusted BIC increase the penalty as sample size increases. The BIC is
χ² + ln(N)[k(k + 1)/2 - df]
where ln(N) is the natural logarithm of the number of cases in the sample, k is the number of variables in the model, and df is the degrees of freedom of the model. The adjusted BIC replaces ln(N) with ln[(N + 2)/24]. The BIC places a high value on parsimony (perhaps too high). The adjusted BIC, while placing a penalty for adding parameters based on sample size, does not place as high a penalty as the BIC. Like the AIC, these measures are not absolute measures and are used to compare the fit of two or more models estimated from the same data set. The adjusted BIC is not given in Amos, but is given in Mplus.
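The two penalties can be compared in a short sketch. It assumes the BIC is chi square plus ln(N) times the number of free parameters, with the free parameters counted as k(k + 1)/2 - df; the function names and the example values are illustrative.

```python
import math


def free_parameters(k: int, df: int) -> int:
    """Free parameters, assuming k(k + 1)/2 distinct variances and covariances."""
    return k * (k + 1) // 2 - df


def bic(chi2: float, k: int, df: int, n: int) -> float:
    """Chi square plus ln(N) per estimated parameter."""
    return chi2 + math.log(n) * free_parameters(k, df)


def adjusted_bic(chi2: float, k: int, df: int, n: int) -> float:
    """Same as the BIC, with ln(N) replaced by ln((N + 2) / 24)."""
    return chi2 + math.log((n + 2) / 24.0) * free_parameters(k, df)


# Hypothetical model: chi square 12, six variables, df = 8, N = 200.
print(round(bic(12.0, 6, 8, 200), 2))
print(round(adjusted_bic(12.0, 6, 8, 200), 2))
```

With N = 200 the per-parameter penalty is about 5.3 for the BIC and about 2.1 for the adjusted BIC, compared with 2 for the AIC.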
GFI and AGFI (LISREL measures)
These measures are affected by sample size and can be
large for models that are poorly specified. The current consensus is not
to use these measures.
Hoelter Index
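The index states the sample size at which chi square would not be significant (alpha = .05), i.e., how small one's sample size would have to be for the result to be no longer significant. The index should only be computed if the chi square is statistically significant. Its formula is:
[(N - 1)χ²(crit)/χ²] + 1
where N is the sample size, χ² is the chi square for the model, and χ²(crit) is the critical value of chi square for the model's degrees of freedom.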
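A minimal sketch of the calculation, assuming the usual form of the index, (N - 1) times the ratio of the critical chi square to the observed chi square, plus 1. The critical value is passed in by the caller (for example from a chi square table); the function name and the example are illustrative.

```python
def hoelter_critical_n(chi2: float, n: int, chi2_crit: float) -> float:
    """Sample size at which the observed chi square would just reach
    the critical value: (N - 1) * chi2_crit / chi2 + 1."""
    if chi2 <= 0:
        raise ValueError("chi2 must be positive")
    return (n - 1) * chi2_crit / chi2 + 1


# df = 10, alpha = .05: the critical chi square is 18.307.
# An observed chi square of 36.614 with N = 201 is twice the critical value.
print(round(hoelter_critical_n(36.614, 201, 18.307)))
```

Because chi square is proportional to N - 1, halving N - 1 from 200 to 100 would bring the observed value down to the critical value, which gives the index of 101.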