# IBM SPSS Amos User's Guide (part 2)

*Part of the IBM SPSS Amos bundle, converted from the User's Guide PDF (Mathpix). Equations are LaTeX; figures are local images. See `llms.txt` for the index.*

## Results of the Analysis

## Text Output

For this example, Amos succeeds in fitting both the saturated and the independence model. Consequently, all fit measures, including the chi-square statistic, are reported. To see the fit measures:

- Click Model Fit in the tree diagram in the upper left corner of the Amos Output window.

The following is the portion of the output that shows the chi-square statistic for the factor analysis model (called Default model), the saturated model, and the independence model:

| CMIN |  |  |  |  |  |
| :--- | ---: | ---: | ---: | ---: | ---: |
| Model | NPAR | CMIN | DF | P | CMIN/DF |
| Default model | 19 | 11.547 | 8 | .173 | 1.443 |
| Saturated model | 27 | .000 | 0 |  |  |
| Independence model | 6 | 117.707 | 21 | .000 | 5.605 |

The chi-square value of 11.547 is not very different from the value of 7.853 obtained in Example 8 with the complete dataset. In both analyses, the $p$ values are above 0.05.
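As a quick check, the entries in the CMIN table can be reproduced by hand. The saturated model for six observed variables has one parameter per mean, variance, and covariance; the degrees of freedom of each model are the saturated parameter count minus that model's NPAR; and CMIN/DF is a simple ratio:

$$
\begin{aligned}
\mathrm{NPAR}_{\text{saturated}} &= 6+\frac{6 \cdot 7}{2}=27 \\
\mathrm{DF}_{\text{default}} &= 27-19=8, \qquad \mathrm{CMIN/DF}=\frac{11.547}{8} \approx 1.443 \\
\mathrm{DF}_{\text{independence}} &= 27-6=21, \qquad \mathrm{CMIN/DF}=\frac{117.707}{21} \approx 5.605
\end{aligned}
$$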

Parameter estimates, standard errors, and critical ratios have the same interpretation as in an analysis of complete data.

Regression Weights: (Group number 1 - Default model)

|  |  |  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | ---: | ---: | ---: | :--- | :--- |
| visperc | <--- | spatial | 1.000 |  |  |  |  |
| cubes | <--- | spatial | .511 | .153 | 3.347 | *** |  |
| lozenges | <--- | spatial | 1.047 | .316 | 3.317 | *** |  |
| paragrap | <--- | verbal | 1.000 |  |  |  |  |
| sentence | <--- | verbal | 1.259 | .194 | 6.505 | *** |  |
| wordmean | <--- | verbal | 2.140 | .326 | 6.572 | *** |  |


Intercepts: (Group number 1 - Default model)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | ---: | ---: | :--- | :--- |
| visperc | 28.885 | .913 | 31.632 | *** |  |
| cubes | 24.998 | .536 | 46.603 | *** |  |
| lozenges | 15.153 | 1.133 | 13.372 | *** |  |
| wordmean | 18.097 | 1.055 | 17.146 | *** |  |
| paragrap | 10.987 | .468 | 23.495 | *** |  |
| sentence | 18.864 | .636 | 29.646 | *** |  |


Covariances: (Group number 1 - Default model)

|  |  |  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | ---: | ---: | ---: | :--- | :--- |
| verbal | <--> | spatial | 7.993 | 3.211 | 2.490 | .013 |  |

Variances: (Group number 1 - Default model)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | ---: | ---: | :--- | :--- |
| spatial | 29.563 | 11.600 | 2.549 | .011 |  |
| verbal | 10.814 | 2.743 | 3.943 | *** |  |
| err_v | 18.776 | 8.518 | 2.204 | .028 |  |
| err_c | 8.034 | 2.669 | 3.011 | .003 |  |
| err_l | 36.625 | 11.662 | 3.141 | .002 |  |
| err_p | 2.825 | 1.277 | 2.212 | .027 |  |
| err_s | 7.875 | 2.403 | 3.277 | .001 |  |
| err_w | 22.677 | 6.883 | 3.295 | *** |  |


Standardized estimates and squared multiple correlations are as follows:

## Standardized Regression Weights: (Group number 1 - Default model)

|  |  |  | Estimate |
| :--- | :--- | :--- | ---: |
| visperc | <--- | spatial | .782 |
| cubes | <--- | spatial | .700 |
| lozenges | <--- | spatial | .685 |
| paragrap | <--- | verbal | .890 |
| sentence | <--- | verbal | .828 |
| wordmean | <--- | verbal | .828 |

## Correlations: (Group number 1 - Default model)

|  |  |  | Estimate |
| :--- | :--- | :--- | ---: |
| verbal | <--> | spatial | .447 |

## Squared Multiple Correlations: (Group number 1 - Default model)

|  | Estimate |
| :--- | ---: |
| wordmean | .686 |
| sentence | .685 |
| paragrap | .793 |
| lozenges | .469 |
| cubes | .490 |
| visperc | .612 |
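The standardized estimates can be recovered from the unstandardized ones shown earlier. As an illustration for visperc (regression weight fixed at 1, spatial variance 29.563, err_v variance 18.776), and assuming the usual formulas for an indicator that loads on a single factor with an uncorrelated error term:

$$
\begin{aligned}
\widehat{\operatorname{Var}}(\text{visperc}) &= 1^{2}(29.563)+18.776=48.339 \\
\mathrm{SMC}(\text{visperc}) &= \frac{29.563}{48.339} \approx .612 \\
\text{standardized weight} &= \sqrt{.612} \approx .782 \\
\operatorname{Corr}(\text{verbal}, \text{spatial}) &= \frac{7.993}{\sqrt{29.563 \times 10.814}} \approx .447
\end{aligned}
$$

The same relationship (squared multiple correlation equals the squared standardized weight) holds for the other five indicators, up to rounding in the third decimal place.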

## Example 17

## Graphics Output

Here is the path diagram showing the standardized estimates and the squared multiple correlations for the endogenous variables:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-69e1437c92.jpg)
Example 17: Factor analysis with missing data. Holzinger and Swineford (1939): Girls' sample. Standardized estimates. Chi square = 11.547, df = 8, p = .173.

The standardized parameter estimates may be compared to those obtained from the complete data in Example 8. The two sets of estimates are identical in the first decimal place.

## Modeling in VB.NET

When you write an Amos program to analyze incomplete data, Amos does not automatically fit the independence and saturated models. (Amos Graphics does fit those models automatically.) If you want your Amos program to fit the independence and saturated models, your program has to include code to specify those models. In particular, in order for your program to compute the usual likelihood ratio chi-square statistic, your program must include code to fit the saturated model.

This section outlines three steps necessary for computing the likelihood ratio chi-square statistic:

- Fitting the factor model
- Fitting the saturated model
- Computing the likelihood ratio chi-square statistic and its $p$ value

First, the three steps are performed by three separate programs. After that, the three steps will be combined into a single program.

## Fitting the Factor Model (Model A)

The following program fits the confirmatory factor model (Model A). It is saved as Ex17-a.vb.

```
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.Title("Example 17 a: Factor Model")
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.AllImpliedMoments()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Grant_x.sav")
        Sem.AStructure("visperc = ( ) + (1) spatial + (1) err_v")
        Sem.AStructure("cubes = ( ) + spatial + (1) err_c")
        Sem.AStructure("lozenges = ( ) + spatial + (1) err_l")
        Sem.AStructure("paragrap = ( ) + (1) verbal + (1) err_p")
        Sem.AStructure("sentence = ( ) + verbal + (1) err_s")
        Sem.AStructure("wordmean = ( ) + verbal + (1) err_w")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub
```

Notice that the ModelMeansAndIntercepts method is used to specify that means and intercepts are parameters of the model, and that each of the six regression equations contains a set of empty parentheses representing an intercept. When you analyze data with missing values, means and intercepts must appear in the model as explicit parameters. This is different from the analysis of complete data, where means and intercepts do not have to appear in the model unless you want to estimate them or constrain them.

The fit of Model A is summarized as follows:

- Function of log likelihood = 1375.133
- Number of parameters = 19

The Function of log likelihood value is displayed instead of the chi-square fit statistic that you get with complete data. In addition, at the beginning of the Summary of models section of the text output, Amos displays the warning:

The saturated model was not fitted to the data of at least one group. For this reason, only the 'function of log likelihood', AIC and BCC are reported. The likelihood ratio chi-square statistic and other fit measures are not reported.

Whenever Amos prints this note, the values in the cmin column of the Summary of models section do not contain the familiar fit chi-square statistics. To evaluate the fit of the factor model, its Function of log likelihood value has to be compared to that of some less constrained baseline model, such as the saturated model.

## Fitting the Saturated Model (Model B)

The saturated model has as many free parameters as there are first and second order moments. When complete data are analyzed, the saturated model always fits the sample data perfectly (with chi-square $=0.00$ and $d f=0$ ). All structural equation models with the same six observed variables are either equivalent to the saturated model or are constrained versions of it. A saturated model will fit the sample data at least as well as any constrained model, and its Function of log likelihood value will be no larger and is, typically, smaller.
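To make the parameter count concrete: for $p$ observed variables, the first and second order moments consist of $p$ means plus $p(p+1)/2$ distinct variances and covariances. With the six test scores of this example:

$$
p+\frac{p(p+1)}{2}=6+\frac{6 \cdot 7}{2}=6+21=27
$$

This is the number of parameters Amos reports for the saturated model.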

The following program fits the saturated model (Model B). The program is saved as Ex17-b.vb.

```
Sub Main()
    Dim Saturated As New AmosEngine
    Try
        'Set up and estimate Saturated model:
        Saturated.Title("Example 17 b: Saturated Model")
        Saturated.TextOutput()
        Saturated.AllImpliedMoments()
        Saturated.ModelMeansAndIntercepts()
        Saturated.BeginGroup(Saturated.AmosDir & "Examples\Grant_x.sav")
        Saturated.Mean("visperc")
        Saturated.Mean("cubes")
        Saturated.Mean("lozenges")
        Saturated.Mean("paragrap")
        Saturated.Mean("sentence")
        Saturated.Mean("wordmean")
        Saturated.FitModel()
    Finally
        Saturated.Dispose()
    End Try
End Sub
```

Following the BeginGroup line, there are six uses of the Mean method, requesting estimates of means for the six variables. When Amos estimates their means, it will automatically estimate their variances and covariances as well, as long as the program does not explicitly constrain the variances and covariances.

The following are the unstandardized parameter estimates for the saturated Model B:

Means: (Group number 1 - Model 1)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | ---: | ---: | :--- | :--- |
| visperc | 28.883 | .910 | 31.756 | *** |  |
| cubes | 25.154 | .540 | 46.592 | *** |  |
| lozenges | 14.962 | 1.101 | 13.591 | *** |  |
| paragrap | 10.976 | .466 | 23.572 | *** |  |
| sentence | 18.802 | .632 | 29.730 | *** |  |
| wordmean | 18.263 | 1.061 | 17.211 | *** |  |

Covariances: (Group number 1 - Model 1)

|  |  |  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | ---: | ---: | ---: | :--- | :--- |
| visperc | <--> | cubes | 17.484 | 4.614 | 3.789 | *** |  |
| visperc | <--> | lozenges | 31.173 | 9.232 | 3.377 | *** |  |
| cubes | <--> | lozenges | 17.036 | 5.459 | 3.121 | .002 |  |
| visperc | <--> | paragrap | 8.453 | 3.705 | 2.281 | .023 |  |
| cubes | <--> | paragrap | 2.739 | 2.179 | 1.257 | .209 |  |
| lozenges | <--> | paragrap | 9.287 | 4.596 | 2.021 | .043 |  |
| visperc | <--> | sentence | 14.382 | 5.114 | 2.813 | .005 |  |
| cubes | <--> | sentence | 1.678 | 2.929 | .573 | .567 |  |
| lozenges | <--> | sentence | 10.544 | 6.050 | 1.743 | .081 |  |
| paragrap | <--> | sentence | 13.470 | 2.945 | 4.574 | *** |  |
| visperc | <--> | wordmean | 14.665 | 8.314 | 1.764 | .078 |  |
| cubes | <--> | wordmean | 3.470 | 4.870 | .713 | .476 |  |
| lozenges | <--> | wordmean | 29.655 | 10.574 | 2.804 | .005 |  |
| paragrap | <--> | wordmean | 23.616 | 5.010 | 4.714 | *** |  |
| sentence | <--> | wordmean | 29.577 | 6.650 | 4.447 | *** |  |

Variances: (Group number 1 - Model 1)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | ---: | ---: | :--- | :--- |
| visperc | 49.584 | 9.398 | 5.276 | *** |  |
| cubes | 16.484 | 3.228 | 5.106 | *** |  |
| lozenges | 67.901 | 13.404 | 5.066 | *** |  |
| paragrap | 13.570 | 2.515 | 5.396 | *** |  |
| sentence | 25.007 | 4.629 | 5.402 | *** |  |
| wordmean | 73.974 | 13.221 | 5.595 | *** |  |

The AllImpliedMoments method in the program displays the following table of estimates:

Implied (for all variables) Covariances (Group number 1 - Model 1)

|  | wordmean | sentence | paragrap | lozenges | cubes | visperc |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: |
| wordmean | 73.974 |  |  |  |  |  |
| sentence | 29.577 | 25.007 |  |  |  |  |
| paragrap | 23.616 | 13.470 | 13.570 |  |  |  |
| lozenges | 29.655 | 10.544 | 9.287 | 67.901 |  |  |
| cubes | 3.470 | 1.678 | 2.739 | 17.036 | 16.484 |  |
| visperc | 14.665 | 14.382 | 8.453 | 31.173 | 17.484 | 49.584 |

Implied (for all variables) Means (Group number 1 - Model 1)

| wordmean | sentence | paragrap | lozenges | cubes | visperc |
| ---: | ---: | ---: | ---: | ---: | ---: |
| 18.263 | 18.802 | 10.976 | 14.962 | 25.154 | 28.883 |

These estimates, even the estimated means, are different from the sample values computed using either pairwise or listwise deletion methods. For example, 53 people took the visual perception test (visperc). The sample mean of those 53 visperc scores is 28.245. One might expect the Amos estimate of the mean visual perception score to be 28.245. In fact, it is 28.883.

Amos displays the following fit information for Model B:

- Function of log likelihood = 1363.586
- Number of parameters = 27

Function of log likelihood values can be used to compare the fit of nested models. In this case, Model A (with a fit statistic of 1375.133 and 19 parameters) is nested within Model B (with a fit statistic of 1363.586 and 27 parameters). When a stronger model (Model A) is being compared to a weaker model (Model B), and where the stronger model is correct, you can say the following: The amount by which the Function of log likelihood increases when you switch from the weaker model to the stronger model is an observation on a chi-square random variable with degrees of freedom equal to the difference in the number of parameters of the two models.

In the present example, the Function of log likelihood for Model A exceeds that for Model B by 11.547 (= 1375.133 - 1363.586). At the same time, Model A requires estimating only 19 parameters while Model B requires estimating 27 parameters, for a difference of 8. In other words, if Model A is correct, 11.547 is an observation on a chi-square variable with 8 degrees of freedom. A chi-square table can be consulted to see whether this chi-square statistic is significant.

## Computing the Likelihood Ratio Chi-Square Statistic and P

Instead of consulting a chi-square table, you can use the ChiSquareProbability method to find the probability that a chi-square value as large as 11.547 would have occurred with a correct factor model. The following program shows how the ChiSquareProbability method is used. The program is saved as Ex17-c.vb.

```
Sub Main()
    Dim ChiSquare As Double, P As Double
    Dim Df As Integer
    'Difference in functions of log likelihood (Model A minus Model B)
    ChiSquare = 1375.133 - 1363.586
    'Difference in number of parameters (Model B minus Model A)
    Df = 27 - 19
    'Probability of a chi-square value at least this large
    P = AmosEngine.ChiSquareProbability(ChiSquare, CDbl(Df))
    Debug.WriteLine("Fit of factor model:")
    Debug.WriteLine("Chi Square = " & ChiSquare.ToString("#,##0.000"))
    Debug.WriteLine("DF = " & Df)
    Debug.WriteLine("P = " & P.ToString("0.000"))
End Sub
```

The program output is displayed in the Debug output panel of the program editor.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-778cf6e121.jpg)

The $p$ value is 0.173; therefore, we do not reject the hypothesis that Model A is correct at the 0.05 level.
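The $p$ value can also be verified without software. For an even number of degrees of freedom, the upper tail of the chi-square distribution has a closed form (a standard result, not something specific to Amos). With 8 degrees of freedom and $x=11.547$, so that $x/2 \approx 5.7735$:

$$
P\left(\chi_{8}^{2} \geq x\right)=e^{-x / 2} \sum_{j=0}^{3} \frac{(x / 2)^{j}}{j!} \approx e^{-5.7735}(1+5.7735+16.667+32.075) \approx 0.00311 \times 55.515 \approx 0.173
$$

Equivalently, the 0.05 critical value of chi-square with 8 degrees of freedom is 15.507, and 11.547 falls short of it.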

As the present example illustrates, in order to test a model with incomplete data, you have to compare its fit to that of another, alternative model. In this example, we wanted to test Model A, and it was necessary also to fit Model B as a standard against which Model A could be compared. The alternative model has to meet two requirements. First, you have to be satisfied that it is correct. Model B certainly meets this criterion, since it places no constraints on the implied moments, and cannot be wrong. Second, it must be more general than the model you wish to test. Any model that can be obtained by removing some of the constraints on the parameters of the model under test will meet this second criterion. If you have trouble thinking up an alternative model, you can always use the saturated model, as was done here.

## Performing All Steps with One Program

It is possible to write a single program that fits both models (the factor model and the saturated model) and then calculates the chi-square statistic and its $p$ value. The program in Ex17-all.vb shows how this can be done.

## More about Missing Data

## Introduction

This example demonstrates the analysis of data in which some values are missing by design and then explores the benefits of intentionally collecting incomplete data.

## Missing Data

Researchers do not ordinarily like missing data. They typically take great care to avoid these gaps whenever possible. But sometimes it is actually better not to observe every variable on every occasion. Matthai (1951) and Lord (1955) described designs where certain data values are intentionally not observed.

The basic principle employed in such designs is that, when it is impossible or too costly to obtain sufficient observations on a variable, estimates with improved accuracy can be obtained by taking additional observations on other correlated variables.

Such designs can be highly useful, but because of computational difficulties, they have not previously been employed except in very simple situations. This example describes only one of many possible designs where some data are intentionally not collected. The method of analysis is the same as in Example 17.

## About the Data

For this example, the Attig data (introduced in Example 1) was modified by eliminating some of the data values and treating them as missing. A portion of the modified data file for young people, Atty_mis.sav, is shown below as it appears in the SPSS Statistics Data Editor. The file contains scores of Attig's 40 young subjects on the two vocabulary tests v_short and vocab. The variable vocab is the WAIS vocabulary score. V_short is the score on a small subset of items on the WAIS vocabulary test. Vocab scores were deleted for 30 randomly picked subjects.

|  | v_short | vocab |
| :--- | :--- | :--- |
| 7 | 6.00 | 51.00 |
| 8 | 9.00 | 52.00 |
| 9 | 8.00 | 60.00 |
| 10 | 5.00 | 48.00 |
| 11 | 13.00 | . |
| 12 | 12.00 | . |
| 13 | 14.00 | . |
| 14 | 4.00 | . |
| 15 | 5.00 | . |

A second data file, Atto_mis.sav, contains vocabulary test scores for the 40 old subjects, again with 30 randomly picked vocab scores deleted.

|  | v_short | vocab |
| :--- | :--- | :--- |
| 7 | 10.00 | 67.00 |
| 8 | 6.00 | 47.00 |
| 9 | 4.00 | 47.00 |
| 10 | .00 | 40.00 |
| 11 | 12.00 | . |
| 12 | 14.00 | . |
| 13 | 13.00 | . |
| 14 | 6.00 | . |
| 15 | 7.00 | . |

Of course, no sensible person deletes data that have already been collected. In order for this example to make sense, imagine this pattern of missing data arising in the following circumstances.

Suppose that vocab is the best vocabulary test you know of. It is highly reliable and valid, and it is the vocabulary test that you want to use. Unfortunately, it is an expensive test to administer. Maybe it takes a long time to give the test, maybe it has to be administered on an individual basis, or maybe it has to be scored by a highly trained person. V_short is not as good a vocabulary test, but it is short, inexpensive, and easy to administer to a large number of people at once. You administer the cheap test, v_short, to 40 young and 40 old subjects. Then you randomly pick 10 people from each group and ask them to take the expensive test, vocab.

Suppose the purpose of the research is to:

- Estimate the average vocab test score in the population of young people.
- Estimate the average vocab score in the population of old people.
- Test the hypothesis that young people and old people have the same average vocab score.

In this scenario, you are not interested in the average v_short score. However, as will be demonstrated below, the v_short scores are still useful because they contain information that can be used to estimate and test hypotheses about vocab scores.

The fact that missing values are missing by design does not affect the method of analysis. Two models will be fitted to the data. In both models, means, variances, and the covariance between the two vocabulary tests will be estimated for young people and also for old people. In Model A, there will be no constraints requiring parameter estimates to be equal across groups. In Model B, vocab will be required to have the same mean in both groups.

## Model A

To estimate means, variances, and the covariance between vocab and v_short, set up a two-group model for the young and old groups.

- Draw a path diagram in which vocab and v_short appear as two rectangles connected by a double-headed arrow.
- From the menus, choose View > Analysis Properties.
- In the Analysis Properties dialog, click the Estimation tab.
- Select Estimate means and intercepts (a check mark appears next to it).
- While the Analysis Properties dialog is open, click the Output tab.
- Select Standardized estimates and Critical ratios for differences.

Because this example focuses on group differences in the mean of vocab, it will be useful to have names for the mean of the young group and the mean of the old group. To give a name to the mean of vocab in the young group:

- Right-click the vocab rectangle in the path diagram for the young group.
- Choose Object Properties from the pop-up menu.
- In the Object Properties dialog, click the Parameters tab.
- Enter a name, such as m1_yng, in the Mean text box.
- Follow the same procedure for the old group. Be sure to give the mean of the old group a unique name, such as m1_old.

Naming the means does not constrain them as long as each name is unique. After the means are named, the two groups should have path diagrams that look something like this:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c6a9fe1ab8.jpg)
Example 18: Model A. Incompletely observed data. Attig (1983) young subjects. Model specification.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-88e56f42bc.jpg)
Example 18: Model A. Incompletely observed data. Attig (1983) old subjects. Model specification.

## Results for Model A

## Graphics Output

Here are the two path diagrams containing means, variances, and covariances for the young and old subjects, respectively:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-26e5475b7f.jpg)
Example 18: Model A. Incompletely observed data. Attig (1983) young subjects. Unstandardized estimates.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-185f1bcf94.jpg)
Example 18: Model A. Incompletely observed data. Attig (1983) old subjects. Unstandardized estimates.

## Text Output

- In the Amos Output window, click Notes for Model in the upper left pane.

The text output shows that Model A is saturated, so that the model is not testable.

| Number of distinct sample moments: | 10 |
| ---: | :---: |
| Number of distinct parameters to be estimated: | 10 |
| Degrees of freedom (10 - 10): | 0 |
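The counts in this table follow from the two-group setup. Each group contributes 2 means, 2 variances, and 1 covariance as sample moments, and Model A estimates every one of them freely:

$$
\underbrace{2 \times(2+2+1)}_{\text {sample moments }}=10, \qquad \underbrace{2 \times(2+2+1)}_{\text {free parameters }}=10, \qquad d f=10-10=0
$$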


The parameter estimates and standard errors for young subjects are:

Means: (young subjects - Default model)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | ---: | ---: | :--- | :--- |
| vocab | 56.891 | 1.765 | 32.232 | *** | m1_yng |
| v_short | 7.950 | .627 | 12.673 | *** | par_4 |

Covariances: (young subjects - Default model)

|  |  |  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | ---: | ---: | ---: | :--- | :--- |
| vocab | <--> | v_short | 32.916 | 8.694 | 3.786 | *** | par_3 |

Correlations: (young subjects - Default model)

|  |  |  | Estimate |
| :--- | :--- | :--- | ---: |
| vocab | <--> | v_short | .920 |

Variances: (young subjects - Default model)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | ---: | ---: | :--- | :--- |
| vocab | 83.320 | 25.639 | 3.250 | .001 | par_7 |

The parameter estimates and standard errors for old subjects are:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-42a8e84e73.jpg)

The estimates for the mean of vocab are 56.891 in the young population and 65.001 in the old population. Notice that these are not the same as the sample means that would have been obtained from the 10 young and 10 old subjects who took the vocab test. The sample means of 58.5 and 62 are good estimates of the population means (the best that can be had from the two samples of size 10), but the Amos estimates (56.891 and 65.001) have the advantage of using information in the v_short scores.

How much more accurate are the mean estimates that include the information in the v_short scores? Some idea can be obtained by looking at estimated standard errors. For the young subjects, the standard error for 56.891 shown above is about 1.765, whereas the standard error of the sample mean, 58.5, is about 2.21. For the old subjects, the standard error for 65.001 is about 2.167, while the standard error of the sample mean, 62, is about 4.21. Although the standard errors just mentioned are only approximations, they still provide a rough basis for comparison. In the case of the young subjects, using the information contained in the v_short scores reduces the standard error of the estimated vocab mean by about 21%. In the case of the old subjects, the standard error was reduced by about 49%.

Another way to evaluate the additional information that can be attributed to the v_short scores is by evaluating the sample size requirements. Suppose you did not use the information in the v_short scores. How many more young examinees would have to take the vocab test to reduce the standard error of its mean by 21%? Likewise, how many more old examinees would have to take the vocab test to reduce the standard error of its mean by 49%? The answer is that, because the standard error of the mean is inversely proportional to the square root of the sample size, it would require about 1.6 times as many young subjects and about 3.8 times as many old subjects. That is, it would require about 16 young subjects and 38 old subjects taking the vocab test, instead of 10 young and 10 old subjects taking both tests, and 30 young and 30 old subjects taking the short test alone. Of course, this calculation treats the estimated standard errors as though they were exact standard errors, and so it gives only a rough idea of how much is gained by using scores on the v_short test.
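The sample size multipliers quoted above follow directly from the inverse square root relationship. If the standard error of a sample mean is to shrink by a fraction $r$ through extra observations alone, the required sample size $n^{\prime}$ satisfies (the symbols $r$, $n$, and $n^{\prime}$ are notation introduced for this illustration):

$$
\frac{n^{\prime}}{n}=\left(\frac{S E}{S E^{\prime}}\right)^{2}=\frac{1}{(1-r)^{2}}, \qquad \frac{1}{(1-0.21)^{2}} \approx 1.60, \qquad \frac{1}{(1-0.49)^{2}} \approx 3.84
$$

With $n=10$ in each group, this gives roughly 16 young and 38 old examinees.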

Do the young and old populations have different mean vocab scores? The estimated mean difference is 8.110 (65.001-56.891). A critical ratio for testing this difference for significance can be found in the following table:

Critical Ratios for Differences between Parameters (Default model)

|  | m1_yng | m1_old | par_3 | par_4 | par_5 | par_6 | par_7 |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| m1_yng | .000 |  |  |  |  |  |  |
| m1_old | 2.901 | .000 |  |  |  |  |  |
| par_3 | -2.702 | -3.581 | .000 |  |  |  |  |
| par_4 | -36.269 | -25.286 | -2.864 | .000 |  |  |  |
| par_5 | -2.847 | -3.722 | -.111 | 2.697 | .000 |  |  |
| par_6 | -25.448 | -30.012 | -2.628 | 2.535 | -2.462 | .000 |  |
| par_7 | 1.028 | .712 | 2.806 | 2.939 | 1.912 | 2.858 | .000 |
| par_8 | -10.658 | -12.123 | -2.934 | 2.095 | -1.725 | 1.514 | -2.877 |
| par_9 | 1.551 | 1.334 | 2.136 | 2.859 | 2.804 | 2.803 | .699 |
| par_10 | -15.314 | -16.616 | -2.452 | 1.121 | -3.023 | .300 | -2.817 |

Critical Ratios for Differences between Parameters (Default model)

|  | par_8 | par_9 | par_10 |
| :--- | ---: | ---: | ---: |
| par_8 | .000 |  |  |
| par_9 | 2.650 | .000 |  |
| par_10 | -1.077 | -2.884 | .000 |


The first two rows and columns, labeled m1_yng and m1_old, refer to the group means of the vocab test. The critical ratio for the mean difference is 2.901, according to which the means differ significantly at the 0.05 level; the older population scores higher on the long test than the younger population.
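The critical ratio can be approximately reproduced from the mean estimates and the standard errors quoted earlier (1.765 for the young group and 2.167 for the old group). This sketch assumes the two mean estimates are uncorrelated, which is reasonable here because they come from independent groups with no cross-group constraints in Model A:

$$
\text { C.R. } \approx \frac{65.001-56.891}{\sqrt{1.765^{2}+2.167^{2}}}=\frac{8.110}{2.795} \approx 2.90
$$

Because 2.90 exceeds 1.96, the difference is significant at the 0.05 level in a two-sided test.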

Another test of the hypothesis of equal vocab group means can be obtained by refitting the model with equality constraints imposed on these two means. We will do that next.

## Model B

In Model B, vocab is required to have the same mean for young people as for old people. There are two ways to impose this constraint. One method is to change the names of the means. In Model A, each mean has a unique name. You can change the names and give each mean the same name. This will have the effect of requiring the two mean estimates to be equal.

A different method of constraining the means will be used here. The names of the means, m1_yng and m1_old, will be left alone. Amos will use its Model Manager to fit both Model A and Model B in a single analysis. To use this approach:

- Start with Model A.
- From the menus, choose Analyze > Manage Models.
- In the Manage Models dialog, type Model A in the Model Name text box.
- Leave the Parameter Constraints box empty.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8bb4665861.jpg)
- To specify Model B, click New.
- In the Model Name text box, change Model Number 2 to Model B.
- Type `m1_old = m1_yng` in the Parameter Constraints text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-45dedc1358.jpg)
- Click Close.

A path diagram that fits both Model A and Model B is saved in the file Ex18-b.amw.

## Output from Models A and B

- To see fit measures for both Model A and Model B, click Model Fit in the tree diagram in the upper left pane of the Amos Output window.

The portion of the output that contains chi-square values is shown here:

| CMIN |  |  |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Model | NPAR | CMIN | DF | P | CMIN/DF |
| Model A | 10 | .000 | 0 |  |  |
| Model B | 9 | 7.849 | 1 | .005 | 7.849 |
| Saturated model | 10 | .000 | 0 |  |  |
| Independence model | 4 | 33.096 | 6 | .000 | 5.516 |

If Model B is correct (that is, the young and old populations have the same mean vocab score), then 7.849 is an observation on a random variable that has a chi-square distribution with one degree of freedom. The probability of getting a value as large as 7.849 by chance is small ( $p=0.005$ ), so Model B is rejected. In other words, young and old subjects differ significantly in their mean vocab scores.
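With one degree of freedom, a chi-square variable is the square of a standard normal variable, so this $p$ value can be checked against a normal table. It is also instructive to compare the likelihood ratio statistic with the square of the critical ratio 2.901 obtained under Model A, which is a Wald-type test of the same hypothesis. The two tests are asymptotically equivalent but need not agree exactly in a finite sample:

$$
P\left(\chi_{1}^{2} \geq 7.849\right)=2[1-\Phi(\sqrt{7.849})] \approx 2[1-\Phi(2.802)] \approx 0.005, \qquad 2.901^{2} \approx 8.416
$$

Here $\Phi$ denotes the standard normal distribution function. Both statistics lead to the same conclusion at the 0.05 level.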

## Modeling in VB.NET

## Model A

The following program fits Model A. It estimates means, variances, and covariances of both vocabulary tests in both groups of subjects, without constraints. The program is saved as Ex18-a.vb.

```
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Crdiff()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(Sem.AmosDir & "Examples\atty_mis.sav")
            Sem.GroupName("young_subjects")
            Sem.Mean("vocab", "m1_yng")
            Sem.Mean("v_short")
        Sem.BeginGroup(Sem.AmosDir & "Examples\atto_mis.sav")
            Sem.GroupName("old_subjects")
            Sem.Mean("vocab", "m1_old")
            Sem.Mean("v_short")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub
```

The Crdiff method displays the critical ratios for parameter differences that were discussed earlier.

For later reference, note the value of the Function of log likelihood for Model A.

```
Function of log likelihood = 429.963
Number of parameters = 10
```


## Model B

Here is a program for fitting Model B. In this program, the same parameter name (mn_vocab) is used for the vocab mean of the young group as for the vocab mean of the old group. In this way, the young group and old group are required to have the same vocab mean. The program is saved as Ex18-b.vb.

```
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Crdiff()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(Sem.AmosDir & "Examples\atty_mis.sav")
            Sem.GroupName("young_subjects")
            Sem.Mean("vocab", "mn_vocab")
            Sem.Mean("v_short")
        Sem.BeginGroup(Sem.AmosDir & "Examples\atto_mis.sav")
            Sem.GroupName("old_subjects")
            Sem.Mean("vocab", "mn_vocab")
            Sem.Mean("v_short")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub
```

Amos reports the fit of Model B as:

```
Function of log likelihood = 437.813
Number of parameters = 9
```

The difference in fit measures between Models B and A is $7.85(=437.813-429.963)$, and the difference in the number of parameters is $1(=10-9)$. These are the same figures we obtained earlier with Amos Graphics.
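The comparison of Models A and B can be reproduced by hand from the two values that Amos reports. The following Python sketch is an illustration only and is not part of Amos; the function name `chi2_sf_1df` is hypothetical, and the shortcut it uses is valid only for one degree of freedom.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)), because X is a squared standard normal.
    return math.erfc(math.sqrt(x / 2.0))

# Values copied from the Amos output for the two models.
fit_a, npar_a = 429.963, 10   # Model A: no constraints
fit_b, npar_b = 437.813, 9    # Model B: equal vocab means in both groups

chi_square = fit_b - fit_a         # difference in "Function of log likelihood"
df = npar_a - npar_b               # difference in number of parameters
p_value = chi2_sf_1df(chi_square)  # shortcut applies only because df == 1

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
```

For differences with more than one degree of freedom, a general chi-square routine from a statistics library would be needed instead of the `erfc` shortcut.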

## Example 19

## Bootstrapping

## Introduction

This example demonstrates how to obtain robust standard error estimates by the bootstrap method.

## The Bootstrap Method

The bootstrap (Efron, 1982) is a versatile method for estimating the sampling distribution of parameter estimates. In particular, the bootstrap can be used to find approximate standard errors. As we saw in earlier examples, Amos automatically displays approximate standard errors for the parameters it estimates. In computing these approximations, Amos uses formulas that depend on the assumptions on p. 36.

The bootstrap is a completely different approach to the problem of estimating standard errors. Why would you want another approach? To begin with, Amos does not have formulas for all of the standard errors you might want, such as standard errors for squared multiple correlations. The unavailability of formulas for standard errors is never a problem with the bootstrap, however. The bootstrap can be used to generate an approximate standard error for every estimate that Amos computes, whether or not a formula for the standard error is known. Even when Amos has formulas for standard errors, the formulas are good only under the assumptions on p. 36. Not only that, but the formulas work only when you are using a correct model. Approximate standard errors arrived at by the bootstrap do not suffer from these limitations.

The bootstrap has its own shortcomings, including the fact that it can require fairly large samples. For readers who are new to bootstrapping, we recommend the Scientific American article by Diaconis and Efron (1983).

The present example demonstrates the bootstrap with a factor analysis model, but, of course, you can use the bootstrap with any model. Incidentally, don't forget that Amos can solve simple estimation problems like the one in Example 1. You might choose to use Amos for such simple problems just so you can use the bootstrapping capability of Amos.
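The idea behind the bootstrap can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm that Amos uses: the data values are made up, and `bootstrap_se` is a hypothetical helper. It resamples the data with replacement, recomputes a statistic each time, and takes the standard deviation of the results as the approximate standard error.

```python
import random
import statistics

def bootstrap_se(data, statistic, n_boot=1000, seed=3):
    """Approximate the standard error of `statistic` by resampling `data`."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Draw n cases with replacement; the original sample plays the role of the population.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)

# Made-up scores, used only for illustration.
scores = [23, 31, 28, 35, 19, 27, 30, 33, 25, 29,
          22, 36, 26, 32, 24, 30, 28, 21, 34, 27]

se_boot = bootstrap_se(scores, statistics.mean)
se_formula = statistics.stdev(scores) / len(scores) ** 0.5  # textbook formula for a mean
se_median = bootstrap_se(scores, statistics.median)         # no simple formula needed

print(f"mean:   bootstrap SE = {se_boot:.3f}, formula SE = {se_formula:.3f}")
print(f"median: bootstrap SE = {se_median:.3f}")
```

For the mean, the bootstrap and the formula should roughly agree. For the median, the same resampling code yields a standard error without any formula, which is the advantage described above.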

## About the Data

We will use the Holzinger and Swineford (1939) data, introduced in Example 8, for this example. The data are contained in the file Grnt_fem.sav.

## A Factor Analysis Model

The path diagram for this model (Ex19.amw) is the same as in Example 8.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6220819f19.jpg)
Example 19: Bootstrapping
Holzinger and Swineford (1939) Girls' sample Model Specification

- To request 500 bootstrap replications, from the menus, choose View > Analysis Properties.
- Click the Bootstrap tab.
- Select Perform bootstrap.
- Type 500 in the Number of bootstrap samples text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-45040a8d65.jpg)


## Monitoring the Progress of the Bootstrap

You can monitor the progress of the bootstrap algorithm by watching the Computation summary panel at the left of the path diagram.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-171e578297.jpg)

## Results of the Analysis

The model fit is, of course, the same as in Example 8.

```
Chi-square = 7.853
Degrees of freedom = 8
Probability level = 0.448
```

The parameter estimates are also the same as in Example 8. However, we would now like to look at the standard error estimates based on the maximum likelihood theory, so that we can compare them to standard errors obtained from the bootstrap.

Here, then, are the maximum likelihood estimates of parameters and their standard errors:

## Regression Weights: (Group number 1 - Default model)

|  |  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| visperc | <--- spatial | 1.000 |  |  |  |  |
| cubes | <--- spatial | .610 | .143 | 4.250 | *** |  |
| lozenges | <--- spatial | 1.198 | .272 | 4.405 | *** |  |
| paragrap | <--- verbal | 1.000 |  |  |  |  |
| sentence | <--- verbal | 1.334 | .160 | 8.322 | *** |  |
| wordmean | <--- verbal | 2.234 | .263 | 8.482 | *** |  |

## Standardized Regression Weights: (Group number 1 - Default model)

|  |  | Estimate |
| :--- | :--- | ---: |
| visperc | <--- spatial | .703 |
| cubes | <--- spatial | .654 |
| lozenges | <--- spatial | .736 |
| paragrap | <--- verbal | .880 |
| sentence | <--- verbal | .827 |
| wordmean | <--- verbal | .841 |

## Covariances: (Group number 1 - Default model)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | ---: | :--- | :--- | ---: | ---: |
| spatial <--> verbal | 7.315 | 2.571 | 2.846 | .004 |  |

## Correlations: (Group number 1 - Default model)

|  | Estimate |
| :--- | ---: |
| spatial <--> verbal | .487 |

## Variances: (Group number 1 - Default model)

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | :--- | :--- | :--- |
| spatial | 23.302 | 8.123 | 2.868 | .004 |  |
| verbal | 9.682 | 2.159 | 4.485 | *** |  |
| err_v | 23.873 | 5.986 | 3.988 | *** |  |
| err_c | 11.602 | 2.584 | 4.490 | *** |  |
| err_l | 28.275 | 7.892 | 3.583 | *** |  |
| err_p | 2.834 | .868 | 3.263 | .001 |  |
| err_s | 7.967 | 1.869 | 4.263 | *** |  |
| err_w | 19.925 | 4.951 | 4.024 | *** |  |

## Squared Multiple Correlations: (Group number 1 - Default model)

|  | Estimate |
| :--- | ---: |
| wordmean | .708 |
| sentence | .684 |
| paragrap | .774 |
| lozenges | .542 |
| cubes | .428 |
| visperc | .494 |

The bootstrap output begins with a table of diagnostic information that is similar to the following:

> 0 bootstrap samples were unused because of a singular covariance matrix.
> 0 bootstrap samples were unused because a solution was not found.
> 500 usable bootstrap samples were obtained.

It is possible that one or more bootstrap samples will have a singular covariance matrix, or that Amos will fail to find a solution for some bootstrap samples. If any such samples occur, Amos reports their occurrence and omits them from the bootstrap analysis. In the present example, no bootstrap sample had a singular covariance matrix, and a solution was found for each of the 500 bootstrap samples. The bootstrap estimates of standard errors are:

## Scalar Estimates (Group number 1 - Default model)

## Regression Weights: (Group number 1 - Default model)

| Parameter | SE | SE-SE | Mean | Bias | SE-Bias |
| :--- | ---: | ---: | ---: | ---: | ---: |
| visperc <--- spatial | .000 | .000 | 1.000 | .000 | .000 |
| cubes <--- spatial | .140 | .004 | .609 | -.001 |  |
| lozenges <--- spatial | .373 | .012 | 1.216 | .018 |  |
| paragrap <--- verbal | .000 | .000 | 1.000 | .000 | .000 |
| sentence <--- verbal | .176 | .006 | 1.345 | .011 | .008 |
| wordmean <--- verbal | .254 | .008 | 2.246 | .011 |  |

## Standardized Regression Weights: (Group number 1 - Default model)

| Parameter | SE | SE-SE | Mean | Bias | SE-Bias |
| :--- | ---: | ---: | ---: | ---: | ---: |
| visperc <--- spatial | .123 | .004 | .709 | .006 |  |
| cubes <--- spatial | .101 | .003 | .646 | -.008 |  |
| lozenges <--- spatial | .121 | .004 | .719 | -.017 |  |
| paragrap <--- verbal | .047 | .001 | .876 | -.004 |  |
| sentence <--- verbal | .042 | .001 | .826 | .000 | .002 |
| wordmean <--- verbal | .050 | .002 | .841 | -.001 | .002 |

## Covariances: (Group number 1 - Default model)

| Parameter | SE | SE-SE | Mean | Bias | SE-Bias |
| :--- | :---: | ---: | ---: | ---: | ---: |
| spatial <--> verbal | 2.393 | .076 | 7.241 | -.074 | .107 |

## Correlations: (Group number 1 - Default model)

| Parameter | SE | SE-SE | Mean | Bias | SE-Bias |
| :--- | :--- | ---: | ---: | ---: | ---: |
| spatial <--> verbal | .132 | .004 | .495 | .008 | .006 |

## Variances: (Group number 1 - Default model)

| Parameter | SE | SE-SE | Mean | Bias | SE-Bias |
| :--- | :--- | :--- | :--- | :--- | :--- |
| spatial | 9.086 | .287 | 23.905 | .603 | .406 |
| verbal | 2.077 | .066 | 9.518 | -.164 | .093 |
| err_v | 9.166 | .290 | 22.393 | -1.480 | .410 |
| err_c | 3.195 | .101 | 11.191 | -.411 | .143 |
| err_l | 9.940 | .314 | 27.797 | -.478 | .445 |
| err_p | .878 | .028 | 2.772 | -.062 | .039 |
| err_s | 1.446 | .046 | 7.597 | -.370 | .065 |
| err_w | 5.488 | .174 | 19.123 | -.803 | .245 |

## Squared Multiple Correlations: (Group number 1 - Default model)

| Parameter | SE | SE-SE | Mean | Bias | SE-Bias |
| :--- | :--- | ---: | ---: | ---: | ---: |
| wordmean | .083 | .003 | .709 | .001 | .004 |
| sentence | .069 | .002 | .685 | .001 | .003 |
| paragrap | .081 | .003 | .770 | -.004 | .004 |
| lozenges | .172 | .005 | .532 | -.010 | .008 |
| cubes | .127 | .004 | .428 | .000 | .006 |
| visperc | .182 | .006 | .517 | .023 | .008 |

- The first column, labeled S.E., contains bootstrap estimates of standard errors. These estimates may be compared to the approximate standard error estimates obtained by maximum likelihood.
- The second column, labeled S.E.-S.E., gives an approximate standard error for the bootstrap standard error estimate itself.
- The column labeled Mean represents the average parameter estimate computed across bootstrap samples. This bootstrap mean is not necessarily identical to the original estimate.
- The column labeled Bias gives the difference between the original estimate and the mean of estimates across bootstrap samples. If the mean estimate across bootstrapped samples is higher than the original estimate, then Bias will be positive.
- The last column, labeled S.E.-Bias, gives an approximate standard error for the bias estimate.
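The relationships among the five columns can be written out as a short Python sketch. The formulas for SE-SE and SE-Bias used here are common large-sample approximations; they reproduce the covariance row of the output above, but that Amos computes the columns in exactly this way is an assumption. The function name `bootstrap_summary` is hypothetical.

```python
import math
import statistics

def bootstrap_summary(original, boot_estimates):
    """Summarize bootstrap estimates of one parameter in the five-column layout."""
    b = len(boot_estimates)
    se = statistics.stdev(boot_estimates)   # spread of the bootstrap estimates
    mean = statistics.mean(boot_estimates)
    return {
        "SE": se,
        "SE-SE": se / math.sqrt(2 * b),     # approximate SE of the SE itself
        "Mean": mean,
        "Bias": mean - original,            # positive when the bootstrap mean is higher
        "SE-Bias": se / math.sqrt(b),       # approximate SE of the bias estimate
    }

# Check the approximations against the spatial <--> verbal covariance row:
# original estimate 7.315, bootstrap SE 2.393, bootstrap mean 7.241, 500 samples.
se, b = 2.393, 500
se_se = se / math.sqrt(2 * b)
se_bias = se / math.sqrt(b)
bias = 7.241 - 7.315

print(f"SE-SE = {se_se:.3f}, Bias = {bias:.3f}, SE-Bias = {se_bias:.3f}")
```

The printed values can be compared with the SE-SE, Bias, and SE-Bias entries reported for that covariance.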


## Modeling in VB.NET

The following program (Ex19.vb) fits the model of Example 19 and performs a bootstrap with 500 bootstrap samples. The program is the same as in Example 8, but with an additional Bootstrap line.

```
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Bootstrap(500)
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Grnt_fem.sav")
        Sem.AStructure("visperc = (1) spatial + (1) err_v")
        Sem.AStructure("cubes = spatial + (1) err_c")
        Sem.AStructure("lozenges = spatial + (1) err_l")
        Sem.AStructure("paragrap = (1) verbal + (1) err_p")
        Sem.AStructure("sentence = verbal + (1) err_s")
        Sem.AStructure("wordmean = verbal + (1) err_w")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub
```

The line Sem.Bootstrap(500) requests bootstrap standard errors based on 500 bootstrap samples.

## Example 20

## Bootstrapping for Model Comparison

## Introduction

This example demonstrates the use of the bootstrap for model comparison.

## Bootstrap Approach to Model Comparison

The problem addressed by this method is not that of evaluating an individual model in absolute terms but of choosing among two or more competing models. Bollen and Stine (1992), Bollen (1982), and Stine (1989) suggested the possibility of using the bootstrap for model selection in analysis of moment structures. Linhart and Zucchini (1986) described a general schema for bootstrapping and model selection that is appropriate for a large class of models, including structural modeling. The Linhart and Zucchini approach is employed here.

The bootstrap approach to model comparison can be summarized as follows:

- Generate several bootstrap samples by sampling with replacement from the original sample. In other words, the original sample serves as the population for purposes of bootstrap sampling.
- Fit every competing model to every bootstrap sample. After each analysis, calculate the discrepancy between the implied moments obtained from the bootstrap sample and the moments of the bootstrap population.
- Calculate the average (across bootstrap samples) of the discrepancies for each model from the previous step.
- Choose the model whose average discrepancy is smallest.
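The four steps above can be sketched in Python for a toy problem with two observed variables and two competing models (saturated and independence). This is an illustration under stated assumptions, not the Amos implementation: the data are made up, all function names are hypothetical, and the discrepancy is the unscaled ML discrepancy function for 2 x 2 covariance matrices.

```python
import math
import random

def moments(sample):
    """Variances and covariance (divisor n) of a list of (x, y) pairs."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample) / n
    syy = sum((y - my) ** 2 for _, y in sample) / n
    sxy = sum((x - mx) * (y - my) for x, y in sample) / n
    return sxx, syy, sxy

def fit_saturated(m):
    return m                      # implied moments equal the sample moments

def fit_independence(m):
    return m[0], m[1], 0.0        # covariance constrained to 0

def ml_discrepancy(implied, population):
    """log|Sigma| + tr(S Sigma^-1) - log|S| - 2 for 2 x 2 covariance matrices."""
    a, b, c = implied             # Sigma = [[a, c], [c, b]]
    s1, s2, s12 = population      # S     = [[s1, s12], [s12, s2]]
    det_sigma = a * b - c * c
    det_s = s1 * s2 - s12 * s12
    trace = (s1 * b - 2 * s12 * c + s2 * a) / det_sigma
    return math.log(det_sigma) + trace - math.log(det_s) - 2

def compare_models(data, models, n_boot=200, seed=3):
    rng = random.Random(seed)
    population = moments(data)    # the original sample acts as the population
    totals = {name: 0.0 for name in models}
    used = 0
    while used < n_boot:
        boot = [rng.choice(data) for _ in data]      # step 1: resample
        m = moments(boot)
        if m[0] * m[1] - m[2] ** 2 <= 1e-12:
            continue                                 # singular sample: draw another
        for name, fit in models.items():             # step 2: fit every model
            totals[name] += ml_discrepancy(fit(m), population)
        used += 1
    return {name: t / n_boot for name, t in totals.items()}  # step 3: average

# Made-up, strongly correlated data.
data = [(i, 2 * i + (i * 7) % 5) for i in range(1, 31)]
means = compare_models(data, {"saturated": fit_saturated,
                              "independence": fit_independence})
best = min(means, key=means.get)                     # step 4: smallest mean wins
print(means, best)
```

Because the toy data are strongly correlated, the independence model should show a much larger mean discrepancy than the saturated model.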


## About the Data

The present example uses the combined male and female data from the Grant-White high school sample of the Holzinger and Swineford (1939) study, previously discussed in Examples 8, 12, 15, 17, and 19. The 145 combined observations are given in the file Grant.sav.

## Five Models

Five measurement models will be fitted to the six psychological tests. Model 1 is a factor analysis model with one factor.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-86b5616739.jpg)
Example 20: Model 1
One-factor model
Holzinger and Swineford (1939) data
Model Specification

Model 2 is an unrestricted factor analysis with two factors. Note that fixing two of the regression weights at 0 does not constrain the model but serves only to make the model identified (Anderson, 1984; Bollen and Jöreskog, 1985; Jöreskog, 1979).

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-cf52b63b9c.jpg)
Example 20: Model 2
Two unconstrained factors Holzinger and Swineford (1939) data
Model Specification

Model 2R is a restricted factor analysis model with two factors, in which the first three tests depend upon only one of the factors while the remaining three tests depend upon only the other factor.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-adaab885a3.jpg)
Example 20: Model 2R
Restricted two-factor model Holzinger and Swineford (1939) data Model Specification

The remaining two models provide customary points of reference for evaluating the fit of the previous models. In the saturated model, the variances and covariances of the observed variables are unconstrained.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d3a4cc7c7e.jpg)

Example 20: Saturated model Variances and covariances Holzinger and Swineford (1939) data Model Specification

In the independence model, the variances of the observed variables are unconstrained and their covariances are required to be 0.

[Figure: Example 20: Independence model. Only variances are estimated. Holzinger and Swineford (1939) data. Model Specification. Observed variables: visperc, cubes, lozenges, paragraph, sentence, wordmean.]

You would not ordinarily fit the saturated and independence models separately, since Amos automatically reports fit measures for those two models in the course of every analysis. However, it is necessary to specify explicitly the saturated and independence models in order to get bootstrap results for those models. Five separate bootstrap analyses must be performed, one for each model. For each of the five analyses:

- From the menus, choose View > Analysis Properties.
- In the Analysis Properties dialog, click the Bootstrap tab.
- Select Perform bootstrap (a check mark appears next to it).
- Type 1000 in the Number of bootstrap samples text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-652e220295.jpg)
- Click the Random \# tab and enter a value for Seed for random numbers.

It does not matter what seed you choose, but in order to draw the exact same set of samples in each of several Amos sessions, the same seed number must be given each time. For this example, we used a seed of 3.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-997a8e153f.jpg)
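The effect of the seed can be illustrated with Python's own random number generator, which is not the generator that Amos uses. In this sketch, `draw_bootstrap_indices` is a hypothetical helper that returns the case numbers drawn for each bootstrap sample.

```python
import random

def draw_bootstrap_indices(n_cases, n_samples, seed):
    """Return n_samples lists of case indices drawn with replacement."""
    rng = random.Random(seed)
    return [[rng.randrange(n_cases) for _ in range(n_cases)]
            for _ in range(n_samples)]

# Two separate "sessions" with the same seed draw identical samples.
session_1 = draw_bootstrap_indices(145, 3, seed=3)
session_2 = draw_bootstrap_indices(145, 3, seed=3)
other_seed = draw_bootstrap_indices(145, 3, seed=4)

print(session_1 == session_2)   # same seed, same samples
print(session_1 == other_seed)  # different seed, different samples
```

Using one seed for all five analyses is what allows the models to be compared on the same bootstrap samples.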

Occasionally, bootstrap samples are encountered for which the minimization algorithm does not converge. To keep overall computation times in check:

- Click the Numerical tab and limit the number of iterations to a realistic figure (such as 40) in the Iteration limit field.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-63fcfe09d0.jpg)

Amos Graphics input files for the five models have been saved with the names Ex20-1.amw, Ex20-2.amw, Ex20-2r.amw, Ex20-sat.amw, and Ex20-ind.amw.

## Text Output

- In viewing the text output for Model 1, click Summary of Bootstrap Iterations in the tree diagram in the upper left pane of the Amos Output window.

The following message shows that it was not necessary to discard any bootstrap samples. All 1,000 bootstrap samples were used.

> 0 bootstrap samples were unused because of a singular covariance matrix.
> 0 bootstrap samples were unused because a solution was not found.
> 1000 usable bootstrap samples were obtained.

- Click Bootstrap Distributions in the tree diagram to see a histogram of

$$
C_{\mathrm{ML}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)=C_{\mathrm{KL}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)-C_{\mathrm{KL}}(\mathbf{a}, \mathbf{a}), \quad b=1, \ldots, 1000
$$

where $\mathbf{a}$ contains sample moments from the original sample of 145 Grant-White students (that is, the moments in the bootstrap population), and $\hat{\alpha}_{b}$ contains the implied moments obtained from fitting Model 1 to the $b$-th bootstrap sample. Thus, $C_{\mathrm{ML}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$ is a measure of how much the population moments differ from the moments estimated from the $b$-th bootstrap sample using Model 1.

| ML discrepancy (implied vs pop) (Default model) |  |
| :--- | :--- |
| N = 1000, Mean = 64.162, S.e. = .292 |  |
| 48.268 | `**` |
| 52.091 | `*********` |
| 55.913 | `*************` |
| 59.735 | `*******************` |
| 63.557 | `*****************` |
| 67.379 | `************` |
| 71.202 | `********` |
| 75.024 | `******` |
| 78.846 | `***` |
| 82.668 | `*` |
| 86.490 | `**` |
| 90.313 | `**` |
| 94.135 | `*` |
| 97.957 | `*` |
| 101.779 | `*` |

The average of $C_{\mathrm{ML}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$ over 1,000 bootstrap samples was 64.162 with a standard error of 0.292. Similar histograms, along with means and standard errors, are displayed for the other four models but are not reproduced here. The average discrepancies for the five competing models are shown in the table below, along with values of BCC, AIC, and CAIC. The table provides fit measures for five competing models (standard errors in parentheses).

| Model | Failures | Mean Discrepancy | BCC | AIC | CAIC |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 0 | 64.16 (0.29) | 68.17 | 66.94 | 114.66 |
| 2 | 19 | 29.14 (0.35) | 36.81 | 35.07 | 102.68 |
| 2R | 0 | 26.57 (0.30) | 30.97 | 29.64 | 81.34 |
| Sat. | 0 | 32.05 (0.37) | 44.15 | 42.00 | 125.51 |
| Indep. | 0 | 334.32 (0.24) | 333.93 | 333.32 | 357.18 |

The Failures column in the table indicates that the likelihood function of Model 2 could not be maximized for 19 of the 1,000 bootstrap samples, at least not with the iteration limit of 40. Nineteen additional bootstrap samples were generated for Model 2 in order to bring the total number of bootstrap samples to the target of 1,000. The 19 samples where Model 2 could not be fitted successfully caused no problem with the other four models. Consequently, 981 bootstrap samples were common to all five models.

No attempt was made to find out why Model 2 estimates could not be computed for 19 bootstrap samples. As a rule, algorithms for analysis of moment structures tend to fail for models that fit poorly. If some way could be found to successfully fit Model 2 to these 19 samples (for example, with hand-picked start values or a superior algorithm), it seems likely that the discrepancies would be large. According to this line of reasoning, discarding bootstrap samples for which estimation failed would lead to a downward bias in the mean discrepancy. Thus, you should be concerned by estimation failures during bootstrapping, primarily when they occur for the model with the lowest mean discrepancy.

In this example, the lowest mean discrepancy (26.57) occurs for Model 2R, confirming the model choice based on the BCC, AIC, and CAIC criteria. The differences among the mean discrepancies are large compared to their standard errors. Since all models were fitted to the same bootstrap samples (except for samples where Model 2 was not successfully fitted), you would expect to find positive correlations across bootstrap samples between discrepancies for similar models. Unfortunately, Amos does not report those correlations. Calculating the correlations by hand shows that they are close to 1, so that standard errors for the differences between means in the table are, on the whole, even smaller than the standard errors of the means.
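The effect of positive correlation on the standard error of a difference can be made explicit. For two mean discrepancies computed on the same bootstrap samples, with standard errors $\mathrm{SE}_{1}$ and $\mathrm{SE}_{2}$ and correlation $r$ between the per-sample discrepancies, the standard error of the difference is:

$$
\mathrm{SE}\left(\bar{C}_{1}-\bar{C}_{2}\right)=\sqrt{\mathrm{SE}_{1}^{2}+\mathrm{SE}_{2}^{2}-2 r\, \mathrm{SE}_{1}\, \mathrm{SE}_{2}}
$$

As an illustration, take Model 2R ($\mathrm{SE}_{1}=0.30$) and the saturated model ($\mathrm{SE}_{2}=0.37$), and suppose $r=0.9$. This value of $r$ is hypothetical, chosen only to show the arithmetic, since the actual correlations are not reported here:

$$
\sqrt{0.30^{2}+0.37^{2}-2(0.9)(0.30)(0.37)}=\sqrt{0.0271} \approx 0.16
$$

With $r=0$, the same formula gives $\sqrt{0.2269} \approx 0.48$. Under the assumed correlation, the standard error of the difference (about 0.16) is smaller than either standard error alone and is small compared to the observed difference of $32.05-26.57=5.48$.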

## Summary

The bootstrap can be a practical aid in model selection for analysis of moment structures. The Linhart and Zucchini (1986) approach uses the expected discrepancy between implied and population moments as the basis for model comparisons. The method is conceptually simple and easy to apply. It does not employ any arbitrary magic number such as a significance level. Of course, the theoretical appropriateness of competing models and the reasonableness of their associated parameter estimates are not taken into account by the bootstrap procedure and need to be given appropriate weight at some other stage in the model evaluation process.

## Modeling in VB.NET

Visual Basic programs for this example are in the files Ex20-1.vb, Ex20-2.vb, Ex20-2r.vb, Ex20-ind.vb, and Ex20-sat.vb.

## Example 21

## Bootstrapping to Compare Estimation Methods

## Introduction

This example demonstrates how bootstrapping can be used to choose among competing estimation criteria.

## Estimation Methods

The discrepancy between the population moments and the moments implied by a model depends not only on the model but also on the estimation method. The technique used in Example 20 to compare models can be adapted to the comparison of estimation methods. This capability is particularly needed when choosing among estimation methods that are known to be optimal only asymptotically, and whose relative merits in finite samples would be expected to depend on the model, the sample size, and the population distribution. The principal obstacle to carrying out this program for comparing estimation methods is that it requires a prior decision about how to measure the discrepancy between the population moments and the moments implied by the model. There appears to be no way to make this decision without favoring some estimation criteria over others. Of course, if every choice of population discrepancy leads to the same conclusion, questions about which is the appropriate population discrepancy can be considered academic. The present example presents such a clear-cut case.

## About the Data

The Holzinger-Swineford (1939) data from Example 20 (in the file Grant.sav) are used in the present example.

## About the Model

The present example estimates the parameters of Model 2R from Example 20 by four alternative methods: Asymptotically distribution-free (ADF), maximum likelihood (ML), generalized least squares (GLS), and unweighted least squares (ULS). To compare the four estimation methods, you need to run Amos four times.

To specify the estimation method and bootstrap parameters:

- From the menus, choose View > Analysis Properties.
- In the Analysis Properties dialog, click the Random \# tab.
- Enter a Seed for random numbers.

As we discussed in Example 20, it does not matter what seed value you choose, but in order to draw the exact same set of samples in each of several Amos sessions, the same seed number must be given each time. In this example, we use a seed of 3.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-bd95590c19.jpg)

- Next, click the Estimation tab.
- Select the Asymptotically distribution-free discrepancy.

This discrepancy specifies that ADF estimation should be used to fit the model to each bootstrap sample.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-48d45a0903.jpg)

- Finally, click the Bootstrap tab.
- Select Perform bootstrap and type 1000 for Number of bootstrap samples.
- Select Bootstrap ADF, Bootstrap ML, Bootstrap GLS, and Bootstrap ULS.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ccb3933a84.jpg)

Selecting Bootstrap ADF, Bootstrap ML, Bootstrap GLS, and Bootstrap ULS specifies that each of $C_{\mathrm{ADF}}, C_{\mathrm{ML}}, C_{\mathrm{GLS}}$, and $C_{\mathrm{ULS}}$ is to be used to measure the discrepancy between the sample moments in the original sample and the implied moments from each bootstrap sample.

To summarize, when you perform the analysis (Analyze > Calculate Estimates), Amos will fit the model to each of 1,000 bootstrap samples using the ADF discrepancy. For each bootstrap sample, the closeness of the implied moments to the population moments will be measured four different ways, using $\mathrm{C}_{\mathrm{ADF}}, \mathrm{C}_{\mathrm{ML}}, \mathrm{C}_{\mathrm{GLS}}$, and $\mathrm{C}_{\text {ULS }}$.

- Select the Maximum likelihood discrepancy to repeat the analysis.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f00aeb6ca1.jpg)
- Select the Generalized least squares discrepancy to repeat the analysis again.
- Select the Unweighted least squares discrepancy to repeat the analysis one last time.

The four Amos Graphics input files for this example are Ex21-adf.amw, Ex21-ml.amw, Ex21-gls.amw, and Ex21-uls.amw.


## Text Output

In the first of the four analyses (as found in Ex21-adf.amw), estimation using ADF produces the following histogram output. To view this histogram:

- Click Bootstrap Distributions > ADF Discrepancy (implied vs pop) in the tree diagram in the upper left pane of the Amos Output window.

| ADF discrepancy (implied vs pop) (Default model) |  |
| :--- | :--- |
| N = 1000, Mean = 20.601, S.e. = .218 |  |
| 7.359 | `*` |
| 10.817 | ![](https://ai-docs.amosdevelopment.com/Images/ug/ug-697d73f176.jpg) |
| 14.274 |  |
| 17.732 | `*******************` |
| 21.189 | `******************` |
| 24.647 | `*************` |
| 28.104 | `********` |
| 31.562 | `****` |
| 35.019 | `**` |
| 38.477 | `**` |
| 41.934 | `*` |
| 45.392 | `*` |
| 48.850 | `*` |
| 52.307 | `*` |
| 55.765 | `*` |

This portion of the output shows the distribution of the population discrepancy $C_{\text {ADF }}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$ across 1,000 bootstrap samples, where $\hat{\alpha}_{b}$ contains the implied moments obtained by minimizing $C_{\text {ADF }}\left(\hat{\alpha}_{b}, \mathbf{a}_{b}\right)$, that is, the sample discrepancy. The average of $C_{\mathrm{ADF}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$ across 1,000 bootstrap samples is 20.601, with a standard error of 0.218.

The following histogram shows the distribution of $C_{\mathrm{ML}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$. To view this histogram:

- Click Bootstrap Distributions > ML Discrepancy (implied vs pop) in the tree diagram in the upper left pane of the Amos Output window.

| ML discrepancy (implied vs pop) (Default model) |  |
| :--- | :--- |
| N = 1000, Mean = 36.860, S.e. = .571 |  |
| 11.272 | `****` |
| 22.691 | `********************` |
| 34.110 | `********************` |
| 45.530 | `***********` |
| 56.949 | `*****` |
| 68.368 | `***` |
| 79.787 | `**` |
| 91.207 | `*` |
| 102.626 | `*` |
| 114.045 | `*` |
| 125.464 | `*` |
| 136.884 |  |
| 148.303 |  |
| 159.722 |  |
| 171.142 | `*` |

The following histogram shows the distribution of $C_{\mathrm{GLS}}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$. To view this histogram:

- Click Bootstrap Distributions > GLS Discrepancy (implied vs pop) in the tree diagram in the upper left pane of the Amos Output window.

| GLS discrepancy (implied vs pop) (Default model) |  |
| :--- | :--- |
| N = 1000, Mean = 21.827, S.e. = .263 |  |
| 7.248 | `**` |
|  |  |
| 14.904 |  |
|  |  |
| 22.561 | `**************` |
| 26.389 | `***********` |
| 30.217 | `*******` |
| 34.046 | `****` |
| 37.874 | `**` |
| 41.702 | `***` |
| 45.530 | `*` |
| 49.359 | `*` |
| 53.187 | `*` |
| 57.015 | `*` |
| 60.844 | `*` |

The following histogram shows the distribution of $C_{\text {ULS }}\left(\hat{\alpha}_{b}, \mathbf{a}\right)$. To view this histogram:

- Click Bootstrap Distributions > ULS Discrepancy (implied vs pop) in the tree diagram in the upper left pane of the Amos Output window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-893e777a6f.jpg)

Below is a table showing the mean of $C\left(\hat{\alpha}_{b}, \mathbf{a}\right)$ across 1,000 bootstrap samples with the standard errors in parentheses. The four distributions just displayed are summarized in the first row of the table. The remaining three rows show the results of estimation by minimizing $C_{\mathrm{ML}}, C_{\mathrm{GLS}}$, and $C_{\mathrm{ULS}}$, respectively.

|  | Population discrepancy for evaluation: $C\left(\hat{\alpha}_{b}, \mathbf{a}\right)$ |  |  |  |
| :--- | :--- | :--- | :--- | :--- |
| Sample discrepancy for estimation: $C\left(\hat{\alpha}_{b}, \mathbf{a}_{b}\right)$ | $C_{\mathrm{ADF}}$ | $C_{\mathrm{ML}}$ | $C_{\mathrm{GLS}}$ | $C_{\mathrm{ULS}}$ |
| $C_{\mathrm{ADF}}$ | 20.60 (0.22) | 36.86 (0.57) | 21.83 (0.26) | 43686 (1012) |
| $C_{\mathrm{ML}}$ | 19.19 (0.20) | 26.57 (0.30) | 18.96 (0.22) | 34760 (830) |
| $C_{\mathrm{GLS}}$ | 19.45 (0.20) | 31.45 (0.40) | 19.03 (0.21) | 37021 (830) |
| $C_{\mathrm{ULS}}$ | 24.89 (0.35) | 31.78 (0.43) | 24.16 (0.33) | 35343 (793) |

The first column, labeled $C_{\mathrm{ADF}}$, shows the relative performance of the four estimation methods according to the population discrepancy, $C_{\mathrm{ADF}}$. Since 19.19 is the smallest mean discrepancy in the $C_{\mathrm{ADF}}$ column, $C_{\mathrm{ML}}$ is the best estimation method according to the $C_{\mathrm{ADF}}$ criterion. Similarly, examining the $C_{\mathrm{ML}}$ column of the table shows that $C_{\mathrm{ML}}$ is the best estimation method according to the $C_{\mathrm{ML}}$ criterion.

Although the four columns of the table disagree on the exact ordering of the four estimation methods, ML is, in all cases, the method with the lowest mean discrepancy. The difference between ML estimation and GLS estimation is slight in some cases.
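The column-by-column reading of the table can be checked mechanically. The following Python sketch is an illustration only; it copies the mean discrepancies from the table (standard errors omitted) and finds, for each population discrepancy, the estimation method with the smallest mean. The variable names are hypothetical.

```python
methods = ["ADF", "ML", "GLS", "ULS"]

# mean_discrepancy[estimation method][population discrepancy], copied from the table.
mean_discrepancy = {
    "ADF": {"ADF": 20.60, "ML": 36.86, "GLS": 21.83, "ULS": 43686},
    "ML":  {"ADF": 19.19, "ML": 26.57, "GLS": 18.96, "ULS": 34760},
    "GLS": {"ADF": 19.45, "ML": 31.45, "GLS": 19.03, "ULS": 37021},
    "ULS": {"ADF": 24.89, "ML": 31.78, "GLS": 24.16, "ULS": 35343},
}

best_by_criterion = {}
ranking_by_criterion = {}
for criterion in methods:
    # Order the estimation methods from smallest to largest mean discrepancy.
    ranking = sorted(methods, key=lambda m: mean_discrepancy[m][criterion])
    ranking_by_criterion[criterion] = ranking
    best_by_criterion[criterion] = ranking[0]

for criterion in methods:
    print(f"C_{criterion}: best = {best_by_criterion[criterion]}, "
          f"order = {ranking_by_criterion[criterion]}")
```

The rankings differ from column to column, but the first entry is the same in every column.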

Unsurprisingly, ULS estimation performed badly, according to all of the population discrepancies employed. More interesting is the poor performance of ADF estimation, indicating that ADF estimation is unsuited to this combination of model, population, and sample size.

## Modeling in VB.NET

Visual Basic programs for this example are in the files Ex21-adf.vb, Ex21-gls.vb, Ex21-ml.vb, and Ex21-uls.vb.

## Example 22

## Specification Search

## Introduction

This example takes you through two specification searches: one is largely confirmatory (with few optional arrows), and the other is largely exploratory (with many optional arrows).

## About the Data

This example uses the Felson and Bohrnstedt (1979) girls' data, also used in Example 7.

## About the Model

The initial model for the specification search comes from Felson and Bohrnstedt (1979), as seen in Figure 22-1:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b5b74bdee7.jpg)
Figure 22-1: Felson and Bohrnstedt's model for girls

## Specification Search with Few Optional Arrows

Felson and Bohrnstedt were primarily interested in the two single-headed arrows, academic ⟵ attract and attract ⟵ academic. The question was whether one or both, or possibly neither, of the arrows was needed. For this reason, you will make both arrows optional during this specification search. The double-headed arrow connecting error1 and error2 is an undesirable feature of the model because it complicates the interpretation of the effects represented by the single-headed arrows, and so you will also make it optional. The specification search will help to decide which of these three optional arrows, if any, are essential to the model.

This specification search is largely confirmatory because most arrows are required by the model, and only three are optional.

## Specifying the Model

- Open %examples%\Ex22a.amw.

The path diagram opens in the drawing area. Initially, there are no optional arrows, as seen in Figure 22-1.

- From the menus, choose Analyze > Specification Search.

The Specification Search window appears. Initially, only the toolbar is visible.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4ae3a4b4b7.jpg)

- Click ---- on the Specification Search toolbar, and then click the double-headed arrow that connects error1 and error2. The arrow changes color to indicate that it is optional.

Tip: If you want the optional arrow to be dashed as well as colored, as seen below, choose View → Interface Properties from the menus, click the Accessibility tab, and select the Alternative to color check box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3d1cbb4bd4.jpg)

- To make the arrow required again, click □ on the Specification Search toolbar, and then click the arrow. When you move the pointer away, the arrow will again display as a required arrow.


- Click ---- again, and then click the arrows in the path diagram until it looks like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-bab3b18e68.jpg)

When you perform the exploratory analysis later on, the program will treat the three colored arrows as optional and will try to fit the model using every possible subset of them.
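The phrase "every possible subset" can be made concrete with a short enumeration. This is only a sketch of the counting argument, not of how Amos performs the search; the arrow labels are plain strings chosen for this illustration.

```python
from itertools import combinations

# The three optional arrows named in this example (labels are illustrative strings).
optional_arrows = [
    "academic <- attract",
    "attract <- academic",
    "error1 <-> error2",
]

# Every subset of the optional arrows defines one candidate model:
# the arrows in the subset are kept, the remaining optional arrows are dropped.
candidate_models = [
    subset
    for size in range(len(optional_arrows) + 1)
    for subset in combinations(optional_arrows, size)
]

print(len(candidate_models), "candidate models")
for subset in candidate_models:
    print(", ".join(subset) if subset else "(no optional arrows)")
```

With three optional arrows there are $2^{3} = 8$ subsets, ranging from the model with none of the optional arrows to the model with all three.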

## Selecting Program Options

- Click the Options button on the Specification Search toolbar.
- In the Options dialog, click the Current results tab.
- Click Reset to ensure that your options are the same as those used in this example.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a90dbf405a.jpg)
- Now click the Next search tab. The text at the top indicates that the exploratory analysis will fit eight (that is, $2^{3}$ ) models.
- In the Retain only the best ___ models box, change the value from 10 to 0.

With a default value of 10, the specification search reports at most 10 one-parameter models, at most 10 two-parameter models, and so on. If the value is set to 0, there is no limitation on the number of models reported.

Limiting the number of models reported can speed up a specification search significantly. However, only eight models in total will be encountered during the specification search for this example, and specifying a nonzero value for Retain only the best ___ models would have the undesirable side effect of inhibiting the program from normalizing Akaike weights and Bayes factors so that they sum to 1 across all models, as seen later.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7513fbe884.jpg)

- Close the Options dialog.


## Performing the Specification Search

- Click the button that performs the specification search on the Specification Search toolbar.

The program fits the model eight times, using every subset of the optional arrows. When it finishes, the Specification Search window expands to show the results.

The following table summarizes fit measures for the eight models and the saturated model:

| Model | Params | df | C | C-df | $\mathrm{BCC}_{0}$ | $\mathrm{BIC}_{0}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 19 | 2 | 2.761 | 0.761 | 3.830 | 10.375 | 1.381 | 0.251 |  |
| 2 | 18 | 3 | 19.155 | 16.155 | 18.154 | 21.427 | 6.385 | 0.000 |  |
| 3 | 17 | 4 | 19.215 | 15.215 | 16.144 | 16.144 | 4.804 | 0.001 |  |
| 4 | $\underline{16}$ | $\underline{5}$ | 67.342 | 62.342 | 62.201 | 58.929 | 13.468 | 0.000 |  |
| 5 | 17 | 4 | 27.911 | 23.911 | 24.840 | 24.840 | 6.978 | 0.000 |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 1.761 | 5.034 | 0.921 | 0.430 |  |
| 7 | 17 | 4 | 3.071 | $\underline{-0.929}$ | $\underline{0.000}$ | $\underline{0.000}$ | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 8 | 18 | 3 | 2.895 | -0.105 | 1.894 | 5.167 | 0.965 | 0.408 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 5.208 | 18.299 |  |  |  |

The Model column contains an arbitrary index number from 1 through 8 for each of the models fitted during the specification search. Sat identifies the saturated model. Looking at the first row, Model 1 has 19 parameters and 2 degrees of freedom. The discrepancy function (which in this case is the likelihood ratio chi-square statistic) is 2.761. Elsewhere in Amos output, the minimum value of the discrepancy function is referred to as CMIN. Here it is labeled $C$ for brevity. To get an explanation of any column of the table, right-click anywhere in the column and choose What's This? from the pop-up menu.

Notice that the best value in each column is underlined, except for the Model and Notes columns.
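The df, C-df, and C/df columns can be reproduced from the Params and C columns. The sketch below assumes that the degrees of freedom equal the saturated model's parameter count (21, from the Sat row) minus the model's parameter count, a relationship that every row of the table satisfies; the names are illustrative.

```python
SATURATED_PARAMS = 21  # parameter count of the Sat row, which has 0 degrees of freedom

# (model, number of parameters, C) copied from the table of fit measures.
rows = [
    ("1", 19, 2.761), ("2", 18, 19.155), ("3", 17, 19.215), ("4", 16, 67.342),
    ("5", 17, 27.911), ("6", 18, 2.763), ("7", 17, 3.071), ("8", 18, 2.895),
]

derived = {}
for model, params, c in rows:
    df = SATURATED_PARAMS - params  # assumed relationship, consistent with the table
    derived[model] = {"df": df, "C-df": c - df, "C/df": c / df}

for model, values in derived.items():
    print(f"Model {model}: df={values['df']}, "
          f"C-df={values['C-df']:.3f}, C/df={values['C/df']:.3f}")
```

The saturated model is left out because C/df is undefined when df is 0.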

Many familiar fit measures ($CFI$ and $RMSEA$, for example) are omitted from this table. Appendix E gives a rationale for the choice of fit measures displayed.

## Viewing Generated Models

- You can double-click any row in the table (other than the Sat row) to see the corresponding path diagram in the drawing area. For example, double-click the row for Model 7 to see its path diagram.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-642482e148.jpg)
Figure 22-2: Path diagram for Model 7

## Viewing Parameter Estimates for a Model

- Click $\gamma$ (the button that shows parameter estimates) on the Specification Search toolbar.
- In the Specification Search window, double-click the row for Model 7.

The drawing area displays the parameter estimates for Model 7.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-1e58e86a93.jpg)
Figure 22-3: Parameter estimates for Model 7

## Using BCC to Compare Models

- In the Specification Search window, click the column heading $\mathrm{BCC}_{0}$.

The table sorts according to $B C C$ so that the best model according to $B C C$ (that is, the model with the smallest $B C C$ ) is at the top of the list.

| Model | Params | df | C | C-df | $\mathbf{BCC}_{\mathbf{0}}$ | $\mathrm{BIC}_{0}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 7 | 17 | 4 | 3.071 | -0.929 | $\underline{0.000}$ | $\underline{0.000}$ | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 1.761 | 5.034 | 0.921 | 0.430 |  |
| 8 | 18 | 3 | 2.895 | -0.105 | 1.894 | 5.167 | 0.965 | 0.408 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 3.830 | 10.375 | 1.381 | 0.251 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 5.208 | 18.299 |  |  |  |
| 3 | 17 | 4 | 19.215 | 15.215 | 16.144 | 16.144 | 4.804 | 0.001 |  |
| 2 | 18 | 3 | 19.155 | 16.155 | 18.154 | 21.427 | 6.385 | 0.000 |  |
| 5 | 17 | 4 | 27.911 | 23.911 | 24.840 | 24.840 | 6.978 | 0.000 |  |
| 4 | $\underline{16}$ | $\underline{5}$ | 67.342 | 62.342 | 62.201 | 58.929 | 13.468 | 0.000 |  |

Based on a suggestion by Burnham and Anderson (1998), a constant has been added to all the $BCC$ values so that the smallest $BCC$ value is 0. The 0 subscript on $BCC_{0}$ serves as a reminder of this rescaling. AIC (not shown in the above figure) and BIC have been similarly rescaled. As a rough guideline, Burnham and Anderson (1998, p. 128) suggest the following interpretation of $AIC_{0}$. $BCC_{0}$ can be interpreted similarly.

| $\mathbf{A I C}_{\mathbf{0}}$ or $\mathbf{B C C}_{\mathbf{0}}$ | Burnham and Anderson interpretation |
| :--- | :--- |
| 0-2 | There is no credible evidence that the model should be ruled out as being the actual $K-L$ best model for the population of possible samples. (See Burnham and Anderson for the definition of $K-L$ best.) |
| 2-4 | There is weak evidence that the model is not the $K-L$ best model. |
| 4-7 | There is definite evidence that the model is not the $K-L$ best model. |
| 7-10 | There is strong evidence that the model is not the $K-L$ best model. |
| >10 | There is very strong evidence that the model is not the $K-L$ best model. |

Although Model 7 is estimated to be the best model according to Burnham and Anderson's guidelines, Models 6 and 8 should not be ruled out.
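The guideline table can be applied mechanically to the $BCC_{0}$ column. In this sketch, the treatment of boundary values (for example, assigning exactly 2 to the lower band) is an assumption, because the published ranges share their endpoints; the short labels are paraphrases of the table entries.

```python
# BCC0 values copied from the sorted table of fit measures.
bcc0 = {
    "7": 0.000, "6": 1.761, "8": 1.894, "1": 3.830, "Sat": 5.208,
    "3": 16.144, "2": 18.154, "5": 24.840, "4": 62.201,
}

def burnham_anderson_label(value):
    """Map a zero-based AIC or BCC value to the rough guideline above.

    Boundary handling (<=) is an assumption; the guideline leaves it open.
    """
    if value <= 2:
        return "no credible evidence against"
    if value <= 4:
        return "weak evidence against"
    if value <= 7:
        return "definite evidence against"
    if value <= 10:
        return "strong evidence against"
    return "very strong evidence against"

labels = {model: burnham_anderson_label(value) for model, value in bcc0.items()}
for model, label in labels.items():
    print(f"Model {model}: BCC0 = {bcc0[model]:.3f} -> {label}")
```

Models 7, 6, and 8 all fall in the lowest band, which is why Models 6 and 8 are not ruled out.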

## Viewing the Akaike Weights

- Click the Options button on the Specification Search toolbar.
- In the Options dialog, click the Current results tab.
- In the BCC, AIC, BIC group, select Akaike weights / Bayes factors (sum = 1).

[Figure: Options dialog, Current results tab]

In the table of fit measures, the column that was labeled $B C C_{0}$ is now labeled $B C C_{\mathrm{p}}$ and contains Akaike weights. (See Appendix G.)

| Model | Params | df | C | C-df | $\mathbf{BCC}_{\mathbf{p}}$ | $\mathrm{BIC}_{\mathrm{p}}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 7 | 17 | 4 | 3.071 | -0.929 | 0.494 | $\underline{0.860}$ | 0.768 | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 0.205 | 0.069 | 0.921 | 0.430 |  |
| 8 | 18 | 3 | 2.895 | -0.105 | 0.192 | 0.065 | 0.965 | 0.408 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 0.073 | 0.005 | 1.381 | 0.251 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 0.037 | 0.000 |  |  |  |
| 3 | 17 | 4 | 19.215 | 15.215 | 0.000 | 0.000 | 4.804 | 0.001 |  |
| 2 | 18 | 3 | 19.155 | 16.155 | 0.000 | 0.000 | 6.385 | 0.000 |  |
| 5 | 17 | 4 | 27.911 | 23.911 | 0.000 | 0.000 | 6.978 | 0.000 |  |
| 4 | $\underline{16}$ | $\underline{5}$ | 67.342 | 62.342 | 0.000 | 0.000 | 13.468 | 0.000 |  |

The Akaike weight has been interpreted (Akaike, 1978; Bozdogan, 1987; Burnham and Anderson, 1998) as the likelihood of the model given the data. With this interpretation, the estimated $K-L$ best model (Model 7) is only about 2.4 times more likely ($0.494 / 0.205 = 2.41$) than Model 6. Bozdogan (1987) points out that, if it is possible to assign prior probabilities to the candidate models, the prior probabilities can be used together with the Akaike weights (interpreted as model likelihoods) to obtain posterior probabilities. With equal prior probabilities, the Akaike weights are themselves posterior probabilities, so that one can say that Model 7 is the $K-L$ best model with probability 0.494, Model 6 is the $K-L$ best model with probability 0.205, and so on. The four most probable models are Models 7, 6, 8, and 1. After adding their probabilities ($0.494+0.205+0.192+0.073=0.96$), one can say that there is a 96% chance that the $K-L$ best model is among those four (Burnham and Anderson, 1998, pp. 127-129). The $p$ subscript on $BCC_{\mathrm{p}}$ serves as a reminder that $BCC_{\mathrm{p}}$ can be interpreted as a probability under some circumstances.
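The arithmetic in this paragraph can be retraced from the $BCC_{0}$ column. The sketch below uses the usual Akaike-weight formula, $\exp(-BCC_{0}/2)$ normalized to sum to 1; that Amos computes the $BCC_{\mathrm{p}}$ column in exactly this way is an assumption, although the results agree with the tabled values to within rounding.

```python
import math

# BCC0 values copied from the table of fit measures.
bcc0 = {
    "7": 0.000, "6": 1.761, "8": 1.894, "1": 3.830, "Sat": 5.208,
    "3": 16.144, "2": 18.154, "5": 24.840, "4": 62.201,
}

# Relative likelihood of each model, then normalization so the weights sum to 1.
relative_likelihood = {model: math.exp(-value / 2) for model, value in bcc0.items()}
total = sum(relative_likelihood.values())
akaike_weight = {model: value / total for model, value in relative_likelihood.items()}

# How much more likely Model 7 is than Model 6, and the combined weight
# of the four most probable models.
ratio_7_to_6 = akaike_weight["7"] / akaike_weight["6"]
top_four = sum(sorted(akaike_weight.values(), reverse=True)[:4])

for model, weight in akaike_weight.items():
    print(f"Model {model}: weight = {weight:.3f}")
print(f"Model 7 vs. Model 6: {ratio_7_to_6:.2f}")
print(f"Combined weight of the four most probable models: {top_four:.2f}")
```

Because the weights are ratios of exponentials of differences, adding a constant to every $BCC$ value (the zero-based rescaling) leaves them unchanged.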

## Using BIC to Compare Models

- On the Current results tab of the Options dialog, select Zero-based ( $\min =0$ ) in the BCC, AIC, BIC group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5095b9fcfe.jpg)
- In the Specification Search window, click the column heading $\mathrm{BIC}_{0}$.

The table is now sorted according to BIC so that the best model according to BIC (that is, the model with the smallest $B I C$ ) is at the top of the list.

| Model | Params | df | C | C-df | $\mathrm{BCC}_{0}$ | $\mathbf{B I C}_{\mathbf{0}}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 7 | 17 | 4 | 3.071 | $\underline{-0.929}$ | $\underline{0.000}$ | $\underline{0.000}$ | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 1.761 | 5.034 | 0.921 | 0.430 |  |
| 8 | 18 | 3 | 2.895 | -0.105 | 1.894 | 5.167 | 0.965 | 0.408 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 3.830 | 10.375 | 1.381 | 0.251 |  |
| 3 | 17 | 4 | 19.215 | 15.215 | 16.144 | 16.144 | 4.804 | 0.001 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 5.208 | 18.299 |  |  |  |
| 2 | 18 | 3 | 19.155 | 16.155 | 18.154 | 21.427 | 6.385 | 0.000 |  |
| 5 | 17 | 4 | 27.911 | 23.911 | 24.840 | 24.840 | 6.978 | 0.000 |  |
| 4 | $\underline{16}$ | 5 | 67.342 | 62.342 | 62.201 | 58.929 | 13.468 | 0.000 |  |

Model 7, with the smallest $B I C$, is the model with the highest approximate posterior probability (using equal prior probabilities for the models and using a particular prior distribution for the parameters of each separate model). Raftery (1995) suggests the following interpretation of $B I C_{0}$ values in judging the evidence for Model 7 against a competing model:

| $\mathbf{BIC}_{\mathbf{0}}$ | Raftery (1995) interpretation |
| :--- | :--- |
| $0-2$ | Weak |
| $2-6$ | Positive |
| $6-10$ | Strong |
| $>10$ | Very strong |

Using these guidelines, you have positive evidence against Models 6 and 8, and very strong evidence against all of the other models as compared to Model 7.

## Using Bayes Factors to Compare Models

- On the Current results tab of the Options dialog, select Akaike weights / Bayes factors (sum = 1) in the BCC, AIC, BIC group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-15ae313b8e.jpg)

In the table of fit measures, the column that was labeled $B I C_{0}$ is now labeled $B I C_{\mathrm{p}}$ and contains Bayes factors scaled so that they sum to 1 .

| Model | Params | df | C | C-df | $\mathrm{BCC}_{\mathrm{p}}$ | $\mathbf{BIC}_{\mathbf{p}}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 7 | 17 | 4 | 3.071 | -0.929 | $\underline{0.494}$ | $\underline{0.860}$ | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 0.205 | 0.069 | 0.921 | 0.430 |  |
| 8 | 18 | 3 | 2.895 | -0.105 | 0.192 | 0.065 | 0.965 | 0.408 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 0.073 | 0.005 | 1.381 | 0.251 |  |
| 3 | 17 | 4 | 19.215 | 15.215 | 0.000 | 0.000 | 4.804 | 0.001 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 0.037 | 0.000 |  |  |  |
| 2 | 18 | 3 | 19.155 | 16.155 | 0.000 | 0.000 | 6.385 | 0.000 |  |
| 5 | 17 | 4 | 27.911 | 23.911 | 0.000 | 0.000 | 6.978 | 0.000 |  |
| 4 | 16 | 5 | 67.342 | 62.342 | 0.000 | 0.000 | 13.468 | 0.000 |  |

With equal prior probabilities for the models and using a particular prior distribution of the parameters of each separate model (Raftery, 1995; Schwarz, 1978), $BIC_{\mathrm{p}}$ values are approximate posterior probabilities. Model 7 is the correct model with probability 0.860. One can be 99% sure that the correct model is among Models 7, 6, and 8 ($0.860+0.069+0.065=0.99$). The $p$ subscript is a reminder that $BIC_{\mathrm{p}}$ values can be interpreted as probabilities.

Madigan and Raftery (1994) suggest that only models in Occam's window be used for purposes of model averaging (a topic not discussed here). The symmetric Occam's window is the subset of models obtained by excluding models that are much less probable (Madigan and Raftery suggest something like 20 times less probable) than the most probable model. In this example, the symmetric Occam's window contains Models 7, 6, and 8 because these are the models whose probabilities ($BIC_{\mathrm{p}}$ values) are greater than $0.860 / 20=0.043$.
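The window can be picked out directly from the $BIC_{\mathrm{p}}$ column. This is a minimal sketch using the tabled values and the factor of 20 mentioned above; the variable names are illustrative.

```python
# BICp values (approximate posterior probabilities) copied from the table.
bic_p = {
    "7": 0.860, "6": 0.069, "8": 0.065, "1": 0.005, "3": 0.000,
    "Sat": 0.000, "2": 0.000, "5": 0.000, "4": 0.000,
}

RATIO = 20  # "something like 20 times less probable" (Madigan and Raftery)

# A model stays in the window if it is no more than RATIO times less probable
# than the most probable model.
threshold = max(bic_p.values()) / RATIO
occams_window = [model for model, p in bic_p.items() if p > threshold]
window_probability = sum(bic_p[model] for model in occams_window)

print(f"Threshold: {threshold:.3f}")
print("Models in Occam's window:", occams_window)
print(f"Combined probability: {window_probability:.3f}")
```

Changing `RATIO` shows how sensitive the window is to the cutoff; with the tabled values, Model 1 would enter only if the cutoff were relaxed to well beyond 100.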

## Rescaling the Bayes Factors

- On the Current results tab of the Options dialog, select Akaike weights / Bayes factors $(\max =1)$ in the BCC, AIC, BIC group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-829021fc0e.jpg)

In the table of fit measures, the column that was labeled $BIC_{\mathrm{p}}$ is now labeled $BIC_{\mathrm{L}}$ and contains Bayes factors scaled so that the largest value is 1. This makes it easier to pick out Occam's window. It consists of models whose $BIC_{\mathrm{L}}$ values are greater than $1 / 20=0.05$; in other words, Models 7, 6, and 8. The $L$ subscript on $BIC_{\mathrm{L}}$ is a reminder that the analogous statistic $BCC_{\mathrm{L}}$ can be interpreted as a likelihood.

| Model | Params | df | C | C-df | $\mathrm{BCC}_{\mathrm{L}}$ | $\mathbf{BIC}_{\mathbf{L}}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 7 | 17 | 4 | 3.071 | -0.929 | $\underline{1.000}$ | $\underline{1.000}$ | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 0.414 | 0.081 | 0.921 | 0.430 |  |
| 8 | 18 | 3 | 2.895 | -0.105 | 0.388 | 0.076 | 0.965 | 0.408 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 0.147 | 0.006 | 1.381 | 0.251 |  |
| 3 | 17 | 4 | 19.215 | 15.215 | 0.000 | 0.000 | 4.804 | 0.001 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 0.074 | 0.000 |  |  |  |
| 2 | 18 | 3 | 19.155 | 16.155 | 0.000 | 0.000 | 6.385 | 0.000 |  |
| 5 | 17 | 4 | 27.911 | 23.911 | 0.000 | 0.000 | 6.978 | 0.000 |  |
| 4 | $\underline{16}$ | 5 | 67.342 | 62.342 | 0.000 | 0.000 | 13.468 | 0.000 |  |
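The three scalings offered in the BCC, AIC, BIC group appear to be related as follows. This is a sketch inferred from the tabled values rather than a formula quoted from the program, with $i$ and $j$ indexing the models in the table:

$$
BIC_{\mathrm{L}}^{(i)} = \exp\left(-\tfrac{1}{2}\, BIC_{0}^{(i)}\right), \qquad BIC_{\mathrm{p}}^{(i)} = \frac{BIC_{\mathrm{L}}^{(i)}}{\sum_{j} BIC_{\mathrm{L}}^{(j)}}
$$

For Model 6, for example, $\exp(-5.034 / 2) \approx 0.081$, which matches its $BIC_{\mathrm{L}}$ entry, and dividing by the column sum of about 1.162 gives 0.069, its $BIC_{\mathrm{p}}$ entry. The model with $BIC_{0}=0$ always receives $BIC_{\mathrm{L}}=1$. The same relations reproduce the $BCC_{\mathrm{L}}$ and $BCC_{\mathrm{p}}$ columns from $BCC_{0}$ to within rounding of the displayed values.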

## Examining the Short List of Models

- Click the short-list button on the Specification Search toolbar. This displays a short list of models.

In the figure below, the short list shows the best model for each number of parameters. It shows the best 16-parameter model, the best 17-parameter model, and so on. Notice that all criteria agree on the best model when the comparison is restricted to models with a fixed number of parameters. The overall best model must be on this list, no matter which criterion is employed.

| Model | Params | df | C | C-df | $\mathrm{BCC}_{\mathrm{L}}$ | $\mathrm{BIC}_{\mathrm{L}}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 4 | $\underline{16}$ | 5 | 67.342 | 62.342 | 0.000 | 0.000 | 13.468 | 0.000 |  |
| 7 | 17 | 4 | 3.071 | $\underline{-0.929}$ | 1.000 | 1.000 | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 0.414 | 0.081 | 0.921 | 0.430 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 0.147 | 0.006 | 1.381 | 0.251 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 0.074 | 0.000 |  |  |  |

Figure 22-4: The best model for each number of parameters

This table shows that the best 17-parameter model fits substantially better than the best 16-parameter model. Beyond 17 parameters, adding additional parameters yields relatively small improvements in fit. In a cost-benefit analysis, stepping from 16 parameters to 17 parameters has a relatively large payoff, while going beyond 17 parameters has a relatively small payoff. This suggests adopting the best 17-parameter model, using a heuristic point of diminishing returns argument. This approach to determining the number of parameters is pursued further later in this example (see "Viewing the Best-Fit Graph for C" and "Viewing the Scree Plot for C").
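The short list can be rebuilt from the full table by keeping, for each number of parameters, the model with the smallest $C$. A minimal sketch using the tabled values (names are illustrative):

```python
# (model, number of parameters, C) for all fitted models plus the saturated model.
rows = [
    ("1", 19, 2.761), ("2", 18, 19.155), ("3", 17, 19.215), ("4", 16, 67.342),
    ("5", 17, 27.911), ("6", 18, 2.763), ("7", 17, 3.071), ("8", 18, 2.895),
    ("Sat", 21, 0.000),
]

# Keep the best-fitting (smallest C) model for each parameter count.
best = {}
for model, params, c in rows:
    if params not in best or c < best[params][1]:
        best[params] = (model, c)

short_list = [(params, best[params][0], best[params][1]) for params in sorted(best)]
for params, model, c in short_list:
    print(f"{params} parameters: Model {model} (C = {c:.3f})")
```

Because models with the same number of parameters also share the same df, ranking them by $C$ gives the same order as ranking them by $C-df$ or $C/df$, which is why the short list does not depend on the choice of criterion.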

## Viewing a Scatterplot of Fit and Complexity

- Click the plot button on the Specification Search toolbar. This opens the Plot window, which displays the following graph:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6d838a6452.jpg)

The graph shows a scatterplot of fit (measured by C) versus complexity (measured by the number of parameters) where each point represents a model. The graph portrays the trade-off between fit and complexity that Steiger characterized as follows:

In the final analysis, it may be, in a sense, impossible to define one best way to combine measures of complexity and measures of badness-of-fit in a single numerical index, because the precise nature of the best numerical trade-off between complexity and fit is, to some extent, a matter of personal taste. The choice of a model is a classic problem in the two-dimensional analysis of preference. (Steiger, 1990, p. 179.)

- Click any of the points in the scatterplot to display a menu that indicates which models are represented by that point and any overlapping points.
- Choose one of the models from the pop-up menu to see that model highlighted in the table of model fit statistics and, at the same time, to see the path diagram of that model in the drawing area.

In the following figure, the cursor points to two overlapping points that represent Model 6 (with a discrepancy of 2.76) and Model 8 (with a discrepancy of 2.90).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-02e9e06134.jpg)

The graph contains a horizontal line representing points for which $C$ is constant. Initially, the line is centered at 0 on the vertical axis. The Fit values panel at the lower left shows that, for points on the horizontal line, $C=0$ and also $F=0$. ( $F$ is referred to as $F M I N$ in Amos output.) $N F I_{1}$ and $N F I_{2}$ are two versions of NFI that use two different baseline models (see Appendix F).

Initially, both $N F I_{1}$ and $N F I_{2}$ are equal to 1 for points on the horizontal line. The location of the horizontal line is adjustable. You can move the line by dragging it with the mouse. As you move the line, you can see the changes in the location of the line reflected in the fit measures in the lower left panel.

## Adjusting the Line Representing Constant Fit

- Move your mouse over the adjustable line. When the pointer changes to a hand, drag the line so that $NFI_{1}$ is equal to 0.900. (Keep an eye on $NFI_{1}$ in the lower left panel while you reposition the adjustable line.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f0f07e41c7.jpg)

$NFI_{1}$ is the familiar form of the NFI statistic for which the baseline model requires the observed variables to be uncorrelated without constraining their means and variances. Points that are below the line have $NFI_{1}>0.900$ and those above the line have $NFI_{1}<0.900$. That is, the adjustable line separates the acceptable models from the unacceptable ones according to a widely used convention based on a remark by Bentler and Bonett (1980).
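Why a horizontal line can stand for a fixed value of $NFI_{1}$ follows from the usual definition of NFI (Bentler and Bonett, 1980). In the sketch below, $C_{b}$ denotes the discrepancy of the baseline model; the symbol is introduced for this illustration, and its value is not reported in this example.

$$
NFI_{1} = 1 - \frac{C}{C_{b}} \quad \Longrightarrow \quad NFI_{1} = 0.900 \iff C = 0.100\, C_{b}
$$

Because $C_{b}$ is the same for every candidate model, $NFI_{1}$ depends on a model only through $C$, so all models with the same $NFI_{1}$ lie at the same height in the plot.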

## Viewing the Line Representing Constant C-df

- In the Plot window, select C - df in the Fit measure group. This displays the following:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-882697ecbd.jpg)

The scatterplot remains unchanged except for the position of the adjustable line. The adjustable line now contains points for which $C-d f$ is constant. Whereas the line was previously horizontal, it is now tilted downward, indicating that $C-d f$ gives some weight to complexity in assessing model adequacy. Initially, the adjustable line passes through the point for which $C-d f$ is smallest.

- Click that point, and then choose Model 7 from the pop-up menu.

This highlights Model 7 in the table of fit measures and also displays the path diagram for Model 7 in the drawing area.

The panel in the lower left corner shows the value of some fit measures that depend only on $C-df$ and that are therefore, like $C-df$ itself, constant along the adjustable line. $CFI_{1}$ and $CFI_{2}$ are two versions of $CFI$ that use two different baseline models (see Appendix F). Initially, both $CFI_{1}$ and $CFI_{2}$ are equal to 1 for points on the adjustable line. When you move the adjustable line, the fit measures in the lower left panel change to reflect the changing position of the line.

## Adjusting the Line Representing Constant C-df

- Drag the adjustable line so that $CFI_{1}$ is equal to 0.950.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-fb4e5c5987.jpg)

$CFI_{1}$ is the usual CFI statistic for which the baseline model requires the observed variables to be uncorrelated without constraining their means and variances. Points that are below the line have $CFI_{1}>0.950$ and those above the line have $CFI_{1}<0.950$. That is, the adjustable line separates the acceptable models from the unacceptable ones according to the recommendation of Hu and Bentler (1999).

## Viewing Other Lines Representing Constant Fit

- Click AIC, BCC, and BIC in turn.

Notice that the slope of the adjustable line becomes increasingly negative. This reflects the fact that the five measures ( $C, C-d f, A I C, B C C$, and $B I C$ ) give increasing weight to model complexity. For each of these five measures, the adjustable line has constant slope, which you can confirm by dragging the line with the mouse. By contrast, the slope of the adjustable line for $C / d f$ is not constant (the slope of the line changes when you drag it with the mouse) and so the slope for $C / d f$ cannot be compared to the slopes for $C, C-d f, A I C, B C C$, and $B I C$.
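The increasing steepness can be checked against the fit table. The sketch below assumes that each measure equals $C$ plus a fixed penalty per parameter (the form that a constant slope implies) and recovers that penalty for $BCC$ and $BIC$ by comparing short-list models with Model 7. The function and variable names are illustrative.

```python
# (number of parameters, C, BCC0, BIC0) for four short-list models, from the fit table.
fit = {
    "7": (17, 3.071, 0.000, 0.000),
    "6": (18, 2.763, 1.761, 5.034),
    "1": (19, 2.761, 3.830, 10.375),
    "Sat": (21, 0.000, 5.208, 18.299),
}
BCC, BIC = 2, 3  # column positions within each tuple

def penalty_per_parameter(column, model, reference="7"):
    """Change in the measure that is not explained by the change in C,
    divided by the number of extra parameters (assumes measure = C + penalty * q)."""
    q, c = fit[model][0], fit[model][1]
    q_ref, c_ref = fit[reference][0], fit[reference][1]
    extra = (fit[model][column] - fit[reference][column]) - (c - c_ref)
    return extra / (q - q_ref)

for model in ("6", "1", "Sat"):
    print(f"Model {model}: BCC penalty = {penalty_per_parameter(BCC, model):.3f}, "
          f"BIC penalty = {penalty_per_parameter(BIC, model):.3f}")
```

Each comparison gives a penalty of about 2.07 per parameter for $BCC$ and about 5.34 for $BIC$. Under the same assumption the line for $C$ has slope 0, the line for $C-df$ has slope -1 (each added parameter removes one degree of freedom), and the line for $AIC$ has slope -2 if $AIC$ is taken as $C$ plus twice the number of parameters, so the ordering of the five slopes matches the text.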

## Viewing the Best-Fit Graph for C

- In the Plot window, select Best fit in the Plot type group.

- In the Fit measure group, select C.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b74fe67dc7.jpg)
Figure 22-5: Smallest value of C for each number of parameters

Each point in this graph represents a model for which $C$ is less than or equal to that of any other model that has the same number of parameters. The graph shows that the best 16-parameter model has $C=67.342$, the best 17-parameter model has $C=3.071$, and so on. While Best fit is selected, the table of fit measures shows the best model for each number of parameters. This table appeared earlier as Figure 22-4.

| Model | Params | df | C | C-df | $\mathrm{BCC}_{\mathrm{L}}$ | $\mathrm{BIC}_{\mathrm{L}}$ | C/df | p | Notes |
| ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 4 | $\underline{16}$ | $\underline{5}$ | 67.342 | 62.342 | 0.000 | 0.000 | 13.468 | 0.000 |  |
| 7 | 17 | 4 | 3.071 | $\underline{-0.929}$ | $\underline{1.000}$ | $\underline{1.000}$ | $\underline{0.768}$ | $\underline{0.546}$ |  |
| 6 | 18 | 3 | 2.763 | -0.237 | 0.414 | 0.081 | 0.921 | 0.430 |  |
| 1 | 19 | 2 | 2.761 | 0.761 | 0.147 | 0.006 | 1.381 | 0.251 |  |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 0.074 | 0.000 |  |  |  |

Notice that the best model for a fixed number of parameters does not depend on the choice of fit measure. For example, Model 7 is the best 17-parameter model according to $C-d f$, and also according to $C / d f$ and every other fit measure. This short list of best models is guaranteed to contain the overall best model, no matter which fit measure is used as the criterion for model selection.

You can view the short list at any time by clicking the short-list button on the Specification Search toolbar. The best-fit graph suggests the choice of 17 as the correct number of parameters on the heuristic grounds that it is the point of diminishing returns. That is, increasing the number of parameters from 16 to 17 buys a comparatively large improvement in $C$ ($67.342-3.071=64.271$), while increasing the number of parameters beyond 17 yields relatively small improvements.

## Viewing the Best-Fit Graph for Other Fit Measures

- While Best fit is selected, try selecting the other choices in the Fit measure group: C - df, AIC, BCC, BIC, and C/df. For example, if you click BIC, you will see this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ed480421b2.jpg)

$BIC$ is the measure among $C$, $C-df$, $AIC$, $BCC$, and $BIC$ that imposes the greatest penalty for complexity. The high penalty for complexity is reflected in the steep positive slope of the graph as the number of parameters increases beyond 17. The graph makes it clear that, according to $BIC$, the best 17-parameter model is superior to any other candidate model.

Notice that clicking different fit measures changes the vertical axis of the best-fit graph and changes the shape of the configuration of points.$^{1}$ However, the identity of each point is preserved. The best 16-parameter model is always Model 4, the best 17-parameter model is always Model 7, and so on. This is because, for a fixed number of parameters, the rank order of models is the same for every fit measure.

1 The saturated model is missing from the $C / d f$ graph because $C / d f$ is not defined for the saturated model.

## Viewing the Scree Plot for C

- In the Plot window, select Scree in the Plot type group.
- In the Fit measure group, select C.

The Plot window displays the following graph:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-02188951fe.jpg)
Figure 22-6: Scree plot for C

In this scree plot, the point with coordinate 17 on the horizontal axis has coordinate 64.271 on the vertical axis. This represents the fact that the best 17-parameter model ($C=3.071$) fits better than the best 16-parameter model ($C=67.342$), with the difference being $67.342-3.071=64.271$. Similarly, the height of the graph at 18 parameters shows the improvement in $C$ obtained by moving from the best 17-parameter model to the best 18-parameter model, and so on. The point located above 21 on the horizontal axis requires a separate explanation. There is no 20-parameter model with which the best 21-parameter model can be compared. (Actually, there is only one 21-parameter model: the saturated model.) The best 21-parameter model ($C=0$) is therefore compared to the best 19-parameter model ($C=2.761$). The height of the 21-parameter point is calculated as $(2.761-0) / 2 = 1.3805$. That is, the improvement in $C$ obtained by moving from the 19-parameter model to the 21-parameter model is expressed as the amount of reduction in $C$ per parameter.
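The heights in the scree plot can be recomputed from the short list of best models. This is a minimal sketch of the calculation just described, using the tabled values of $C$; the variable names are illustrative.

```python
# (number of parameters, smallest C for that number of parameters), from the short list.
short_list = [(16, 67.342), (17, 3.071), (18, 2.763), (19, 2.761), (21, 0.000)]

# Height at q parameters: reduction in C relative to the next smaller model on the
# short list, divided by the number of parameters added to get there.
scree = {}
for (q_prev, c_prev), (q, c) in zip(short_list, short_list[1:]):
    scree[q] = (c_prev - c) / (q - q_prev)

for q, height in scree.items():
    print(f"{q} parameters: improvement in C per added parameter = {height:.4f}")
```

The heights are 64.271 at 17 parameters, 0.308 at 18, 0.002 at 19, and 1.3805 at 21. The rise from 19 to 21 parameters shows that the plot is not monotone in this example.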

Either Figure 22-5 or Figure 22-6 can be used to support a heuristic point of diminishing returns argument in favor of 17 parameters. There is this difference: In the best-fit graph (Figure 22-5), one looks for an elbow in the graph, or a place where the slope changes from relatively steep to relatively flat. For the present problem, this occurs at 17 parameters, which can be taken as support for the best 17-parameter model. In the scree plot (Figure 22-6), one also looks for an elbow, but the elbow occurs at 18 parameters in this example. This is also taken as support for the best 17-parameter model. In a scree plot, an elbow at $k$ parameters provides support for the best $(k-1)$-parameter model.

The scree plot is so named because of its similarity to the graph known as a scree plot in principal components analysis (Cattell, 1966). In principal components analysis, a scree plot shows the improvement in model fit that is obtained by adding components to the model, one component at a time. The scree plot presented here for SEM shows the improvement in model fit that is obtained by incrementing the number of model parameters. The scree plot for SEM is not identical in all respects to the scree plot for principal components analysis. For example, in principal components, one obtains a sequence of nested models when introducing components one at a time. This is not necessarily the case in the scree plot for SEM. The best 17-parameter model, say, and the best 18-parameter model may or may not be nested. (In the present example, they are.) Furthermore, in principal components, the scree plot is always monotone non-increasing, which is not guaranteed in the case of the scree plot for SEM, even with nested models. Indeed, the scree plot for the present example is not monotone.

In spite of the differences between the traditional scree plot and the scree plot presented here, it is proposed that the new scree plot be used in the same heuristic fashion as the traditional one. A two-stage approach to model selection is suggested. In the first stage, the number of parameters is selected by examining either the scree plot or the short list of models. In the second stage, the best model is chosen from among those models that have the number of parameters determined in the first stage.

## Viewing the Scree Plot for Other Fit Measures

With Scree selected in the Plot type group, select the other choices in the Fit measure group: C-df, AIC, BCC, and BIC (but not C/df).

For example, if you select BIC, you will see this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-18f32b8941.jpg)

For $C - df$, $AIC$, $BCC$, and $BIC$, the units and the origin of the vertical axis are different than for $C$, but the graphs are otherwise identical. This means that the final model selected by the scree test is independent of which measure of fit is used (unless $C/df$ is used). This is the advantage of the scree plot over the best-fit plot demonstrated earlier in this example (see "Viewing the Best-Fit Graph for C" on p. 356, and "Viewing the Best-Fit Graph for Other Fit Measures" on p. 358). The best-fit plot and the scree plot contain nearly the same information, but the shape of the best-fit plot depends on the choice of fit measure while the shape of the scree plot does not (with the exception of $C/df$).

Both the best-fit plot and the scree plot are independent of sample size in the sense that altering the sample size without altering the sample moments has no effect other than to rescale the vertical axis.

## Specification Search with Many Optional Arrows

The previous specification search was largely confirmatory in that there were only three optional arrows. You can take a much more exploratory approach to constructing a model for the Felson and Bohrnstedt data. Suppose that your only hypothesis about the six measured variables is that

- academic depends on the other five variables, and
- attract depends on the other five variables.

The path diagram shown in Figure 22-7 with 11 optional arrows implements this hypothesis. It specifies which variables are endogenous, and nothing more. Every observed-variable model that is consistent with the hypothesis is included in the specification search. The covariances among the observed, exogenous variables could have been made optional, but doing so would have increased the number of optional arrows from 11 to 17, increasing the number of candidate models from 2,048 (that is, $2^{11}$ ) to 131,072 (that is, $2^{17}$ ). Allowing the covariances among the observed, exogenous variables to be optional would have been costly, and there would seem to be little interest in searching for models in which some pairs of those variables are uncorrelated.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3921485894.jpg)
Figure 22-7: Highly exploratory model for Felson and Bohrnstedt's girls' data
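The model counts quoted above follow from simple arithmetic. The sketch below assumes that four of the six measured variables are exogenous (all except academic and attract), so that there are six covariances among them; the variable names in the code are illustrative only.

```python
from math import comb

optional_arrows = 11        # optional arrows in Figure 22-7
exogenous_observed = 4      # assumption: six measured variables minus academic and attract

# One covariance per pair of observed, exogenous variables.
optional_covariances = comb(exogenous_observed, 2)

# Each optional arrow is either present or absent, so the count doubles per arrow.
models_as_specified = 2 ** optional_arrows
models_with_optional_covariances = 2 ** (optional_arrows + optional_covariances)

print(optional_covariances)                # 6
print(models_as_specified)                 # 2048
print(models_with_optional_covariances)    # 131072
print(models_with_optional_covariances // models_as_specified)  # 64
```

Making the six covariances optional would multiply the number of candidate models by 64.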

## Specifying the Model

- Open %examples%\Ex22b.amw.

Tip: If the last file you opened was in the Examples folder, you can open the file by double-clicking it in the Files list to the left of the drawing area.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e22b835860.jpg)

## Making Some Arrows Optional

- From the menus, choose Analyze > Specification Search.
- Click ---- on the Specification Search toolbar, and then click the arrows in the path diagram until it looks like the diagram on p. 362.

Tip: You can change multiple arrows at once by clicking and dragging the mouse pointer through them.

## Setting Options to Their Defaults

- Click the Options button on the Specification Search toolbar.
- In the Options dialog, click the Next search tab.


- In the Retain only the best ___ models box, change the value from 0 to 10.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8a44021926.jpg)

This restores the default setting we altered earlier in this example. With the default setting, the program displays only the 10 best models according to whichever criterion you use for sorting the columns of the model list. This limitation is desirable now because of the large number of models that will be generated for this specification search.

- Click the Current results tab.
- In the BCC, AIC, BIC group, select Zero-based (min = 0).


## Performing the Specification Search

- On the Specification Search toolbar, click the button that performs the specification search.

The search takes about 10 seconds on a 1.8 GHz Pentium 4. When it finishes, the Specification Search window expands to show the results.

## Using BIC to Compare Models

- In the Specification Search window, click the $BIC_0$ column heading. This sorts the table according to $BIC_0$.

| Model | Params | df | C | C-df | $\mathrm{BCC}_{0}$ | $\mathbf{BIC}_{\mathbf{0}}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 22 | 15 | 6 | 5.156 | -0.844 | 0.132 | $\underline{0.000}$ | 0.859 | 0.524 |  |
| 32 | 16 | 5 | 2.954 | -2.046 | $\underline{0.000}$ | 3.141 | 0.591 | 0.707 |  |
| 33 | 16 | 5 | 3.101 | -1.899 | 0.147 | 3.288 | 0.620 | 0.684 |  |
| 34 | 16 | 5 | 4.623 | -0.377 | 1.669 | 4.810 | 0.925 | 0.464 |  |
| 35 | 16 | 5 | 4.623 | -0.377 | 1.669 | 4.810 | 0.925 | 0.464 |  |
| 36 | 16 | 5 | 4.623 | -0.377 | 1.669 | 4.810 | 0.925 | 0.464 |  |
| 37 | 16 | 5 | 5.055 | 0.055 | 2.101 | 5.242 | 1.011 | 0.409 | Unstable |
| 38 | 16 | 5 | 5.055 | 0.055 | 2.101 | 5.242 | 1.011 | 0.409 |  |
| 39 | 16 | 5 | 5.079 | 0.079 | 2.125 | 5.266 | 1.016 | 0.406 |  |
| 40 | 16 | 5 | 5.081 | 0.081 | 2.127 | 5.268 | 1.016 | 0.406 |  |

Figure 22-8: The 10 best models according to $\mathrm{BIC}_{0}$

The sorted table shows that Model 22 is the best model according to $BIC_0$. (Model numbers depend in part on the order in which the objects in the path diagram were drawn; therefore, if you draw your own path diagram, your model numbers may differ from the model numbers here.) The second-best model according to $BIC_0$, namely Model 32, is the best according to $BCC_0$. These models are shown below:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f33dba3d43.jpg)
Model 22

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f79c6ff4f1.jpg)
Model 32

## Viewing the Scree Plot

- On the Specification Search toolbar, click the button that opens the Plot window.
- In the Plot window, select Scree in the Plot type group.

The scree plot strongly suggests that models with 15 parameters provide an optimum trade-off of model fit and parsimony.

- Click the point with the horizontal coordinate 15. A pop-up appears that indicates the point represents Model 22, for which the change in chi-square is 46.22.
- Click 22 (46.22) to display Model 22 in the drawing area.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a9457f62c6.jpg)


## Limitations

The specification search procedure is limited to the analysis of data from a single group.

## Example 23

## Exploratory Factor Analysis by Specification Search

## Introduction

This example demonstrates exploratory factor analysis by means of a specification search. In this approach to exploratory factor analysis, any measured variable can (optionally) depend on any factor. A specification search is performed to find the subset of single-headed arrows that provides the optimum combination of simplicity and fit. It also demonstrates a heuristic specification search that is practical for models that are too big for an exhaustive specification search.

## About the Data

This example uses the Holzinger and Swineford (1939) girls' data from Example 8.

## About the Model

The initial model is shown in Figure 23-1 on p. 368. During the specification search, all single-headed arrows that point from factors to measured variables will be made optional. The purpose of the specification search is to obtain guidance as to which single-headed arrows are essential to the model; in other words, which variables depend on which factors.

The two factor variances are both fixed at 1, as are all the regression weights associated with residual variables. Without these constraints, all the models encountered during the specification search would be unidentified.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e45d8e5fd9.jpg)
Figure 23-1: Exploratory factor analysis model with two factors

## Specifying the Model

- Open the file %examples%\Ex23.amw.

Initially, the path diagram appears as in Figure 23-1. There is no point in trying to fit this model as it stands because it is not identified, even with the factor variances fixed at 1.

## Opening the Specification Search Window

- To open the Specification Search window, choose Analyze > Specification Search. Initially, only the toolbar is visible, as seen here:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-866fbbd62e.jpg)


## Making All Regression Weights Optional

- Click ---- on the Specification Search toolbar, and then click all the single-headed arrows in the path diagram.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-91e948d415.jpg)
Figure 23-2: Two-factor model with all regression weights optional

During the specification search, the program will attempt to fit the model using every possible subset of the optional arrows.

## Setting Options to Their Defaults

- Click the Options button on the Specification Search toolbar.
- In the Options dialog, click the Current results tab.
- Click Reset to ensure that your options are the same as those used in this example.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c2b1dfc8ee.jpg)
- Now click the Next search tab. Notice that the default value for Retain only the best ___ models is 10.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-683ee7aaa3.jpg)

With this setting, the program will display only the 10 best models according to whichever criterion you use for sorting the columns of the model list. For example, if you click the column heading $C/df$, the table will show the 10 models with the smallest values of $C/df$, sorted according to $C/df$. Scatterplots will display only the 10 best 1-parameter models, the 10 best 2-parameter models, and so on. It is useful to place a limit on the number of models to be displayed when there are a lot of optional parameters.

In this example, there are 12 optional parameters, so there are $2^{12}=4096$ candidate models. Storing results for a large number of models can affect performance. Limiting the display to the best 10 models for each number of parameters means that the program has to maintain a list of only about $10 \times 13=130$ models. The program will have to fit many more than 130 models in order to find the best 10 models for each number of parameters, but not quite as many as 4,096. The program uses a branch-and-bound algorithm similar to the one used in all-possible-subsets regression (Furnival and Wilson, 1974) to avoid fitting some models unnecessarily.
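The counts in the preceding paragraph can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes that the factor 13 corresponds to models that use 0 through 12 of the optional parameters, and it says nothing about how Amos actually stores its list of models.

```python
from math import comb

optional = 12   # optional parameters in this example
retain = 10     # value of "Retain only the best ___ models"

# Number of candidate models that use exactly k of the optional parameters.
per_count = [comb(optional, k) for k in range(optional + 1)]

total_models = sum(per_count)             # every subset of the 12 optional parameters
distinct_counts = len(per_count)          # k = 0, 1, ..., 12
upper_bound = retain * distinct_counts    # the "about 130" figure

# Only one model exists for k = 0 and for k = 12, so the list can be shorter still.
retained_at_most = sum(min(retain, n) for n in per_count)

print(total_models, distinct_counts, upper_bound, retained_at_most)
# 4096 13 130 112
```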

## Performing the Specification Search

- On the Specification Search toolbar, click the button that performs the specification search.

The search takes about 12 seconds on a 1.8 GHz Pentium 4. When it finishes, the Specification Search window expands to show the results.

Initially, the list of models is not very informative. The models are listed in the order in which they were encountered, and the models encountered early in the search were found to be unidentified. The method used for classifying models as unidentified is described in Appendix D.

| Model | Params | df | C | C-df | $\mathrm{BCC}_{0}$ | $\mathrm{BIC}_{0}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 7 | 14 |  |  |  |  |  |  | Unidentified |
| 2 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 3 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 4 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 5 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 6 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 7 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 8 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 9 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 10 | 8 | 13 |  |  |  |  |  |  | Unidentified |

## Using BCC to Compare Models

- In the Specification Search window, click the column heading $BCC_0$.

The table is sorted according to $BCC$, so that the best model according to $BCC$ (that is, the model with the smallest $BCC$) is at the top of the list.

| Model | Params | df | C | C-df | $\mathbf{B C C}_{\mathbf{0}}$ | $\mathrm{BIC}_{0}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 52 | 13 | 8 | 7.853 | -0.147 | $\underline{0.000}$ | $\underline{0.000}$ | 0.982 | 0.448 |  |
| 53 | 13 | 8 | 7.853 | -0.147 | $\underline{0.000}$ | $\underline{0.000}$ | 0.982 | 0.448 |  |
| 62 | 14 | 7 | 5.770 | $\underline{-1.230}$ | 0.132 | 2.207 | 0.824 | $\underline{0.567}$ |  |
| 63 | 14 | 7 | 5.770 | $\underline{-1.230}$ | 0.132 | 2.207 | 0.824 | $\underline{0.567}$ |  |
| 65 | 14 | 7 | 7.155 | 0.155 | 1.517 | 3.593 | 1.022 | 0.413 |  |
| 64 | 14 | 7 | 7.155 | 0.155 | 1.517 | 3.593 | 1.022 | 0.413 |  |
| 67 | 14 | 7 | 7.608 | 0.608 | 1.971 | 4.046 | 1.087 | 0.368 |  |
| 66 | 14 | 7 | 7.608 | 0.608 | 1.971 | 4.046 | 1.087 | 0.368 |  |
| 68 | 14 | 7 | 7.632 | 0.632 | 1.995 | 4.070 | 1.090 | 0.366 |  |
| 69 | 14 | 7 | 7.632 | 0.632 | 1.995 | 4.070 | 1.090 | 0.366 |  |

Figure 23-3: The 10 best models according to $\mathrm{BCC}_{0}$

The two best models according to $BCC_0$ (Models 52 and 53) have identical fit measures (out to three decimal places anyway). The explanation for this can be seen from the path diagrams for the two models.

- In the Specification Search window, double-click the row for Model 52. This displays its path diagram in the drawing area.
- To see the path diagram for Model 53, double-click its row.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-947f339785.jpg)
Figure 23-4: Reversing F1 and F2 yields another candidate model

This is just one pair of models where reversing the roles of $F1$ and $F2$ changes one member of the pair into the other. There are other such pairs. Models 52 and 53 are equivalent, although they are counted separately in the list of 4,096 candidate models. The 10 models in Figure 23-3 on p. 372 come in five pairs, but candidate models do not always come in equivalent pairs, as Figure 23-5 illustrates. The model in that figure does not occur among the 10 best models for six optional parameters and is not identified for that matter, but it does illustrate how reversing $F1$ and $F2$ can fail to yield a different member of the set of 4,096 candidate models.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-bfd54200ba.jpg)
Figure 23-5: Reversing F1 and F2 yields the same candidate model
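The pairing of candidate models can be made concrete with a small enumeration. The sketch below is an illustration only: it describes a candidate model purely by which of the six measured variables load on $F1$ and which load on $F2$, and it assumes that the rest of the path diagram is symmetric in the two factors. The function and variable names are hypothetical, and nothing here checks identification.

```python
from itertools import product

variables = ["visperc", "cubes", "lozenges", "paragrap", "sentence", "wordmean"]

def swap_factors(model):
    """Reverse the roles of F1 and F2."""
    f1_loadings, f2_loadings = model
    return (f2_loadings, f1_loadings)

# Every subset of the six measured variables.
subsets = [
    frozenset(v for v, keep in zip(variables, mask) if keep)
    for mask in product([False, True], repeat=len(variables))
]

# A candidate model = (variables loading on F1, variables loading on F2).
models = [(f1, f2) for f1 in subsets for f2 in subsets]

# Models that reversing F1 and F2 leaves unchanged (the situation in Figure 23-5).
self_symmetric = [m for m in models if swap_factors(m) == m]

# Group each model with its mirror image.
distinct_up_to_swap = {frozenset([m, swap_factors(m)]) for m in models}

print(len(models), len(self_symmetric), len(distinct_up_to_swap))
# 4096 64 2080
```

Under these assumptions, 64 of the 4,096 candidate models are unchanged by the reversal, and the remaining 4,032 form 2,016 equivalent pairs.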

The occurrence of equivalent candidate models makes it unclear how to apply Bayesian calculations to select a model in this example. Similarly, it is unclear how to use Akaike weights. Furthermore, Burnham and Anderson's guidelines (see p. 344) for the interpretation of $BCC_0$ are based on reasoning about Akaike weights, so it is not clear whether those guidelines apply in the present example. On the other hand, the use of $BCC_0$ without reference to the Burnham and Anderson guidelines seems unexceptionable. Model 52 (or the equivalent Model 53) is the best model according to $BCC_0$.

Although $BCC_0$ chooses the model employed in Example 8, which was based on a model of Jöreskog and Sörbom (1996), it might be noted that Model 62 (or its equivalent, Model 63) is a very close second in terms of $BCC_0$ and is the best model according to some other fit measures. Model 63 has the following path diagram:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7c25b8f8c5.jpg)
Figure 23-6: Model 63

The factors, $F1$ and $F2$, seem roughly interpretable as spatial ability and verbal ability in both Models 53 and 63. The two models differ in their explanation of scores on the cubes test. In Model 53, cubes scores depend entirely on spatial ability. In Model 63, cubes scores depend on both spatial ability and verbal ability. Since it is a close call in terms of every criterion based on fit and parsimony, it may be especially appropriate here to pay attention to interpretability as a model selection criterion. The scree test in the following step, however, does not equivocate as to which is the best model.

## Viewing the Scree Plot

- On the Specification Search toolbar, click the button that opens the Plot window.
- In the Plot window, select Scree in the Plot type group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-21eea3a24e.jpg)

The scree plot strongly suggests the use of 13 parameters because of the way the graph drops abruptly and then levels off immediately after the 13th parameter. Click the point with coordinate 13 on the horizontal axis. A pop-up shows that the point represents Models 52 and 53, as shown in Figure 23-4 on p. 373.

## Viewing the Short List of Models

- On the Specification Search toolbar, click the button that displays the short list of models. Take note of the short list of models for future reference.


## Heuristic Specification Search

The number of models that must be fitted in an exhaustive specification search grows rapidly with the number of optional arrows. There are 12 optional arrows in Figure 23-2 on p. 369, so an exhaustive specification search requires fitting $2^{12}=4096$ models. (The number of models will be somewhat smaller if you specify a small positive number for Retain only the best ___ models on the Next search tab of the Options dialog.) A number of heuristic search procedures have been proposed for reducing the number of models that have to be fitted (Salhi, 1998). None of these is guaranteed to find the best model, but they have the advantage of being computationally feasible in problems with more than, say, 20 optional arrows where an exhaustive specification search is impossible.

Amos provides three heuristic search strategies in addition to the option of an exhaustive search. The heuristic strategies do not attempt to find the overall best model because this would require choosing a definition of best in terms of the minimum or maximum of a specific fit measure. Instead, the heuristic strategies attempt to find the 1-parameter model with the smallest discrepancy, the 2-parameter model with the smallest discrepancy, and so on. By adopting this approach, a search procedure can be designed that is independent of the choice of fit measure. You can select among the available search strategies on the Next search tab of the Options dialog. The choices are as follows:

- All subsets. An exhaustive search is performed. This is the default.
- Forward. The program first fits the model with no optional arrows. Then it adds one optional arrow at a time, always adding whichever arrow gives the largest reduction in discrepancy.
- Backward. The program first fits the model with all optional arrows in the model. Then it removes one optional arrow at a time, always removing whichever arrow gives the smallest increase in discrepancy.
- Stepwise. The program alternates between Forward and Backward searches, beginning with a Forward search. The program keeps track of the best 1-optional-arrow model encountered, the best 2-optional-arrow model, and so on. After the first Forward search, the Forward and Backward search algorithms are modified by the following rule: The program will add an arrow or remove an arrow only if the resulting model has a smaller discrepancy than any previously encountered model with the same number of arrows. For example, the program will add an arrow to a 5-optional-arrow model only if the resulting 6-optional-arrow model has a smaller discrepancy than any previously encountered 6-optional-arrow model. Forward and Backward searches are alternated until one Forward or Backward search is completed with no improvement.
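The Forward strategy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Amos's implementation: `forward_search`, `toy_discrepancy`, the arrow names, and the `gains` values are all hypothetical, and the toy discrepancy simply subtracts a fixed amount for each arrow that is present. Ties are broken at random, as the guide describes for the heuristic searches.

```python
import random

def forward_search(arrows, discrepancy, rng=None):
    """Greedy Forward search: repeatedly add the arrow that lowers the discrepancy most."""
    rng = rng or random.Random(0)
    current = frozenset()                      # start with no optional arrows
    remaining = set(arrows)
    path = [(current, discrepancy(current))]
    while remaining:
        scored = [(discrepancy(current | {a}), a) for a in sorted(remaining)]
        best = min(score for score, _ in scored)
        tied = [a for score, a in scored if score == best]
        chosen = rng.choice(tied)              # pick at random among tied arrows
        current = current | {chosen}
        remaining.remove(chosen)
        path.append((current, best))
    return path

def forward_fits(n):
    """Models fitted by this sketch for n optional arrows: 1 + n + (n-1) + ... + 1."""
    return 1 + n * (n + 1) // 2

# Hypothetical toy problem: each arrow reduces the discrepancy by a fixed amount.
gains = {"a": 40.0, "b": 25.0, "c": 5.0, "d": 0.5}

def toy_discrepancy(model):
    return 80.0 - sum(gains[arrow] for arrow in model)

path = forward_search(gains, toy_discrepancy)
order = [next(iter(new - old)) for (old, _), (new, _) in zip(path, path[1:])]

print(order)                       # ['a', 'b', 'c', 'd']
print([c for _, c in path])        # [80.0, 40.0, 15.0, 10.0, 9.5]
print(forward_fits(12), 2 ** 12)   # 79 4096
```

In this sketch a Forward search over 12 optional arrows fits 79 models, compared with 4,096 for an exhaustive search. With a discrepancy that is not additive in the arrows, the greedy path is not guaranteed to pass through the best model for each number of arrows.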


## Performing a Stepwise Search

- Click the Options button on the Specification Search toolbar.
- In the Options dialog, click the Next search tab.
- Select Stepwise.
- On the Specification Search toolbar, click
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-21e5664f83.jpg)

The results in Figure 23-7 suggest examining the 13-parameter model, Model 7. Its discrepancy $C$ is much smaller than the discrepancy for the best 12-parameter model and not much larger than the best 14-parameter model. Model 7 is also best according to both $BCC$ and $BIC$. (Your results may differ from those in the figure because of an element of randomness in the heuristic specification search algorithms. When adding an arrow during a forward step or removing an arrow during a backward step, there may not be a unique best choice. In that case, one arrow is picked at random from among the arrows that are tied for best.)

| Model | Params | df | C | C-df | $\mathrm{BCC}_{0}$ | $\mathrm{BIC}_{0}$ | C/df | p | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 7 | 14 |  |  |  |  |  |  | Unidentified |
| 2 | 8 | 13 |  |  |  |  |  |  | Unidentified |
| 3 | 9 | 12 |  |  |  |  |  |  | Unidentified |
| 4 | 10 | 11 |  |  |  |  |  |  | Unidentified |
| 5 | 11 | 10 | 97.475 | 87.475 | 85.191 | 81.041 | 9.747 | 0.000 |  |
| 6 | 12 | 9 | 33.469 | 24.469 | 23.401 | 21.326 | 3.719 | 0.000 |  |
| 7 | 13 | 8 | 7.853 | -0.147 | $\underline{0.000}$ | $\underline{0.000}$ | 0.982 | 0.448 |  |
| 8 | 14 | 7 | 5.770 | $\underline{-1.230}$ | 0.132 | 2.207 | 0.824 | $\underline{0.567}$ |  |
| 9 | 15 | 6 | 5.594 | -0.406 | 2.172 | 6.322 | 0.932 | 0.470 |  |
| 10 | 16 | 5 | 5.528 | 0.528 | 4.322 | 10.547 | 1.106 | 0.355 |  |
| 11 | 17 | 4 | 5.476 | 1.476 | 6.485 | 14.785 | 1.369 | 0.242 |  |
| 12 | 18 | 3 |  |  |  |  |  |  | Unidentified |
| 13 | 19 | 2 |  |  |  |  |  |  | Unidentified |
| Sat | 21 | 0 | $\underline{0.000}$ | 0.000 | 9.870 | 26.471 |  |  |  |

Figure 23-7: Results of stepwise specification search

## Viewing the Scree Plot

- On the Specification Search toolbar, click the button that opens the Plot window.
- In the Plot window, select Scree in the Plot type group.

The scree plot confirms that adding a 13th parameter provides a substantial reduction in discrepancy and that adding additional parameters beyond the 13th provides only slight reductions.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-45e692c453.jpg)
Figure 23-8: Scree plot after stepwise specification search

- Click the point in the scree plot with horizontal coordinate 13, as in Figure 23-8. The pop-up that appears shows that Model 7 is the best 13-parameter model.
- Click 7 (25.62) on the pop-up. This displays the path diagram for Model 7 in the drawing area.

Tip: You can also do this by double-clicking the row for Model 7 in the Specification Search window.
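The value 25.62 in the pop-up can be recovered from the $C$ column of Figure 23-7. The sketch below applies the height rule described for the scree plot in Example 22 (reduction in $C$ per added parameter, relative to the best model with fewer parameters) to the identified models with 11 through 17 parameters. The function name and data layout are illustrative assumptions.

```python
# Discrepancy C of the model found for each number of parameters (Figure 23-7).
best_c = {11: 97.475, 12: 33.469, 13: 7.853, 14: 5.770, 15: 5.594, 16: 5.528, 17: 5.476}

def scree_heights(c_by_params):
    """Reduction in C per added parameter, relative to the previous parameter count."""
    counts = sorted(c_by_params)
    heights = {}
    for prev, k in zip(counts, counts[1:]):
        heights[k] = (c_by_params[prev] - c_by_params[k]) / (k - prev)
    return heights

heights = scree_heights(best_c)
for k, h in heights.items():
    print(k, round(h, 3))

print(round(heights[13], 2))   # 25.62, the value shown next to Model 7 in the pop-up
```

The heights fall from about 25.6 at 13 parameters to about 2.1 at 14 parameters and stay small through 17 parameters, which is the abrupt drop and leveling off that the scree plot displays.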

## Limitations of Heuristic Specification Searches

A heuristic specification search can fail to find any of the best models for a given number of parameters. In fact, the stepwise search in the present example did fail to find any of the best 11-parameter models. As Figure 23-7 on p. 377 shows, the best 11-parameter model found by the stepwise search had a discrepancy ($C$) of 97.475. An exhaustive search, however, turns up two models that have a discrepancy of 55.382. For every other number of parameters, the stepwise search did find one of the best models.

Of course, it is only when you can perform an exhaustive search to double-check the result of a heuristic search that you can know whether the heuristic search was successful. In those problems where a heuristic search is the only available technique, not only is there no guarantee that it will find one of the best models for each number of parameters, but there is no way to know whether it has succeeded in doing so.

Even in those cases where a heuristic search finds one of the best models for a given number of parameters, it does not (as implemented in Amos) give information about other models that fit equally well or nearly as well.

## Multiple-Group Factor Analysis

## Introduction

This example demonstrates a two-group factor analysis with automatic specification of cross-group constraints.

## About the Data

This example uses the Holzinger and Swineford (1939) girls' and boys' data from Examples 12 and 15.

## Model 24a: Modeling Without Means and Intercepts

The presence of means and intercepts as explicit model parameters adds to the complexity of a multiple-group analysis. The treatment of means and intercepts will be postponed until Model 24b. For now, consider fitting the following factor analysis model, with no explicit means and intercepts, to the data of girls and of boys:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-1c3253f79b.jpg)
Figure 24-1: Two-factor model for girls and boys

This is the same two-group factor analysis problem that was considered in Example 12. The results obtained in Example 12 will be obtained here automatically.

## Specifying the Model

- From the menus, choose File > Open.
- In the Open dialog, enter the file name %examples%\Ex24a.amw, and then click the Open button.

The path diagram is the same for boys as for girls and is shown in Figure 24-1. Some regression weights are fixed at 1. These regression weights will remain fixed at 1 throughout the analysis to follow. The assisted multiple-group analysis adds constraints to the model you specify but does not remove any constraints.

## Opening the Multiple-Group Analysis Dialog Box

- From the menus, choose Analyze > Multiple-Group Analysis.
- Click OK in the message box that appears. This opens the Multiple-Group Analysis dialog.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-9ef42b9397.jpg)
Figure 24-2: The Multiple-Group Analysis dialog

Most of the time, you will simply click OK. This time, however, let's take a look at some parts of the Multiple-Group Analysis dialog.

There are eight columns of check boxes. Check marks appear only in the columns labeled 1, 2, and 3. This means that the program will generate three models, each with a different set of cross-group constraints.

Column 1 contains a single check mark in the row labeled Measurement weights, which is short for regression weights in the measurement part of the model. In the case of a factor analysis model, these are the factor loadings. The following section shows you how to view the measurement weights in the path diagram. Column 1 generates a model in which measurement weights are constant across groups (that is, the same for boys as for girls).

Column 2 contains check marks for Measurement weights and also Structural covariances, which is short for variances and covariances in the structural part of the model. In a factor analysis model, these are the factor variances and covariances. The following section shows you how to view the structural covariances in the path diagram. Column 2 generates a model in which measurement weights and structural covariances are constant across groups.

Column 3 contains all the check marks in column 2 and also a check mark next to Measurement residuals, which is short for variances and covariances of residual (error) variables in the measurement part of the model. The following section shows you how to view the measurement residuals in the path diagram. The three parameter subsets that appear in a black (that is, not gray) font are mutually exclusive and exhaustive, so that column 3 generates a model in which all parameters are constant across groups.

In summary, columns 1 through 3 generate a hierarchy of models in which each model contains all the constraints of its predecessor. First, the factor loadings are held constant across groups. Then, the factor variances and covariances are held constant. Finally, the residual (unique) variances are held constant.

## Viewing the Parameter Subsets

- In the Multiple-Group Analysis dialog, click Measurement weights.

The measurement weights are now displayed in color in the drawing area. If there is a check mark next to Alternative to color on the Accessibility tab of the Interface Properties dialog, the measurement weights will also display as thick lines, as shown here:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-2f3766b92f.jpg)

- Click Structural covariances to see the factor variances and covariances emphasized.
- Click Measurement residuals to see the error variables emphasized.

This is an easy way to visualize which parameters are affected by each cross-group constraint.

## Viewing the Generated Models

- In the Multiple-Group Analysis dialog, click OK.

The path diagram now shows names for all parameters. In the panel at the left of the path diagram, you can see that the program has generated three new models in addition to an Unconstrained model in which there are no cross-group constraints at all.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-71113fca50.jpg)
Figure 24-3: Amos Graphics window after automatic constraints

- Double-click XX: Measurement weights. This opens the Manage Models dialog, which shows you the constraints that require the factor loadings to be constant across groups.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d167b8209c.jpg)


## Fitting All the Models and Viewing the Output

- From the menus, choose Analyze > Calculate Estimates to fit all models.
- From the menus, choose View > Text Output.
- In the navigation tree of the output viewer, click the Model Fit node to expand it, and then click CMIN.

The CMIN table shows the likelihood ratio chi-square statistic for each fitted model. The data do not depart significantly from any of the models. Furthermore, at each step up the hierarchy from the Unconstrained model to the Measurement residuals model, the increase in chi-square is never much larger than the increase in degrees of freedom. There appears to be no significant evidence that girls' parameter values differ from boys' parameter values.

Here is the CMIN table:

| Model | NPAR | CMIN | DF | P | CMIN/DF |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Unconstrained | 26 | 16.48 | 16 | 0.42 | 1.03 |
| Measurement weights | 22 | 18.29 | 20 | 0.57 | 0.91 |
| Structural covariances | 19 | 22.04 | 23 | 0.52 | 0.96 |
| Measurement residuals | 13 | 26.02 | 29 | 0.62 | 0.90 |
| Saturated model | 42 | 0.00 | 0 |  |  |
| Independence model | 12 | 337.55 | 30 | 0.00 | 11.25 |
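The comparison described above, in which each increase in chi-square is set against the corresponding increase in degrees of freedom, can be tabulated directly from the CMIN table. The sketch below only computes the differences between successive models in the hierarchy; it does not compute $p$ values. The variable names are illustrative.

```python
# (model, NPAR, CMIN, DF) for the hierarchy of models, taken from the CMIN table.
fits = [
    ("Unconstrained", 26, 16.48, 16),
    ("Measurement weights", 22, 18.29, 20),
    ("Structural covariances", 19, 22.04, 23),
    ("Measurement residuals", 13, 26.02, 29),
]

steps = []
for (_, npar0, cmin0, df0), (name1, npar1, cmin1, df1) in zip(fits, fits[1:]):
    delta_cmin = round(cmin1 - cmin0, 2)   # increase in chi-square
    delta_df = df1 - df0                   # increase in degrees of freedom
    constrained = npar0 - npar1            # parameters removed by the added constraints
    steps.append((name1, delta_cmin, delta_df, constrained))

for step in steps:
    print(step)
# ('Measurement weights', 1.81, 4, 4)
# ('Structural covariances', 3.75, 3, 3)
# ('Measurement residuals', 3.98, 6, 6)
```

Each step gains exactly as many degrees of freedom as the number of parameters it constrains, and no chi-square increase is much larger than its degrees of freedom, which is the pattern the text describes.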

- In the navigation tree, click AIC under the Model Fit node.

$AIC$ and $BCC$ values indicate that the best trade-off of model fit and parsimony is obtained by constraining all parameters to be equal across groups (the Measurement residuals model).

Here is the AIC table:

| Model | AIC | BCC | BIC | CAIC |
| :--- | :--- | :--- | :--- | :--- |
| Unconstrained | 68.48 | 74.12 |  |  |
| Measurement weights | 62.29 | 67.07 |  |  |
| Structural covariances | 60.04 | 64.16 |  |  |
| Measurement residuals | 52.02 | 54.84 |  |  |
| Saturated model | 84.00 | 93.12 |  |  |
| Independence model | 361.55 | 364.16 |  |  |
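
The AIC column is consistent with computing AIC as CMIN plus twice the number of parameters (NPAR). The sketch below, written in Python for illustration only, recomputes the column from the CMIN table and picks the model with the smallest value; the dictionary layout and variable names are assumptions made for this example.

```python
# name: (NPAR, CMIN, AIC as reported), copied from the CMIN and AIC tables.
rows = {
    "Unconstrained": (26, 16.48, 68.48),
    "Measurement weights": (22, 18.29, 62.29),
    "Structural covariances": (19, 22.04, 60.04),
    "Measurement residuals": (13, 26.02, 52.02),
    "Saturated model": (42, 0.00, 84.00),
    "Independence model": (12, 337.55, 361.55),
}

# Recompute AIC as CMIN + 2 * NPAR and compare with the reported column.
recomputed = {name: round(cmin + 2 * npar, 2) for name, (npar, cmin, _) in rows.items()}
matches = all(abs(recomputed[name] - rows[name][2]) < 0.005 for name in rows)

# The preferred model is the one with the smallest AIC.
best = min(recomputed, key=recomputed.get)
print(matches, best)
```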

## Customizing the Analysis

There were two opportunities to override the automatically generated cross-group constraints. In Figure 24-2 on p. 383, you could have changed the check marks in columns 1, 2, and 3, and you could have generated additional models by placing check marks in columns 4 through 8. Then, in Figure 24-3 on p. 385, you could have renamed or modified any of the automatically generated models listed in the panel at the left of the path diagram.

## Model 24b: Comparing Factor Means

Introducing explicit means and intercepts into a model raises additional questions about which cross-group parameter constraints should be tested, and in what order. This example shows how Amos constrains means and intercepts while fitting the factor analysis model in Figure 24-1 on p. 382 to data from separate groups of girls and boys.

This is the same two-group factor analysis problem that was considered in Example 15. The results in Example 15 will be obtained here automatically.

## Specifying the Model

- From the menus, choose File > Open.
- In the Open dialog, enter the file name %examples%\Ex24b.amw, and then click the Open button.

The path diagram is the same for boys as for girls and is shown below. Some regression weights are fixed at 1. The means of all the unobserved variables are fixed at 0. In the following section, you will remove the constraints on the girls' factor means. The other constraints (the ones that you do not remove) will remain in effect throughout the analysis.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3440a10156.jpg)
Figure 24-4: Two-factor model with explicit means and intercepts

## Removing Constraints

Initially, the factor means are fixed at 0 for both boys and girls. It is not possible to estimate factor means for both groups. However, Sörbom (1974) showed that, by fixing the factor means of a single group to constant values and placing suitable constraints on the regression weights and intercepts in a factor model, it is possible to obtain meaningful estimates of the factor means for all of the other groups. In the present example, this means picking one group, say boys, and fixing their factor means to a constant, say 0, and then removing the constraints on the factor means of the remaining group, the girls. The constraints on regression weights and intercepts required by Sörbom's approach will be generated automatically by Amos.
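
One way to see why factor means cannot be estimated freely in every group is to write out the implied means of the observed variables. The notation here is introduced only for illustration and does not appear in Amos output: $\mu^{(g)}$ is the vector of observed-variable means in group $g$, $\tau$ the measurement intercepts, $\Lambda$ the factor loadings (both assumed equal across groups), and $\kappa^{(g)}$ the factor means in group $g$.

$$
\begin{aligned}
\mu^{(g)} &= \tau + \Lambda \kappa^{(g)}, \qquad g \in \{\text{boys}, \text{girls}\} \\
\tau + \Lambda \kappa^{(g)} &= (\tau - \Lambda c) + \Lambda\left(\kappa^{(g)} + c\right) \quad \text{for any constant vector } c \\
\kappa^{(\text{boys})} = 0 \;&\Longrightarrow\; \kappa^{(\text{girls})} \text{ is the girls' factor mean relative to the boys'}
\end{aligned}
$$

The second line shows that shifting every group's factor means by the same amount $c$ can be absorbed into the intercepts without changing the implied means. Fixing one group's factor means at 0 removes that arbitrary shift.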

The boys' factor means are already fixed at 0. To remove the constraints on the girls' factor means, do the following:

- In the drawing area of the Amos Graphics window, right-click Spatial and choose Object Properties from the pop-up menu.
- In the Object Properties dialog, click the Parameters tab.
- Select the 0 in the Mean box, and press the Delete key.
- With the Object Properties dialog still open, click Verbal in the drawing area. This displays the properties for the verbal factor in the Object Properties dialog.
- In the Mean box on the Parameters tab, select the 0 and press the Delete key.
- Close the Object Properties dialog.

Now that the constraints on the girls' factor means have been removed, the girls' and boys' path diagrams look like this:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-26303ae519.jpg)
Girls

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d47e56f243.jpg)
Boys

Tip: To switch between path diagrams in the drawing area, click either Boys or Girls in the List of Groups pane to the left.

## Generating the Cross-Group Constraints

- From the menus, choose Analyze > Multiple-Group Analysis.
- Click OK in the message box that appears. This opens the Multiple-Group Analysis dialog.

The Multiple-Group Analysis dialog lists eight parameter subsets as rows (Measurement weights, Measurement intercepts, Structural weights, Structural intercepts, Structural means, Structural covariances, Structural residuals, and Measurement residuals) and eight models as columns (1 through 8). A check mark in a cell constrains that parameter subset to be equal across groups in that model. The dialog also has Help, Default, OK, and Cancel buttons.

The default settings, as shown above, will generate the following nested hierarchy of five models:

| Model | Constraints |
| :--- | :--- |
| Model 1 (column 1) | Measurement weights (factor loadings) are equal across groups. |
| Model 2 (column 2) | All of the above, and measurement intercepts (intercepts in the equations for predicting measured variables) are equal across groups. |
| Model 3 (column 3) | All of the above, and structural means (factor means) are equal across groups. |
| Model 4 (column 4) | All of the above, and structural covariances (factor variances and covariances) are equal across groups. |
| Model 5 (column 5) | All parameters are equal across groups. |

- Click OK.

## Fitting the Models

- From the menus, choose Analyze > Calculate Estimates.

The panel at the left of the path diagram shows that two models could not be fitted to the data. The two models that could not be fitted, the Unconstrained model with no cross-group constraints, and the Measurement weights model with factor loadings held equal across groups, are unidentified.

XX: Unconstrained
XX: Measurement weights
OK: Measurement intercepts
OK: Structural means
OK: Structural covariances
OK: Measurement residuals

## Viewing the Output

- From the menus, choose View > Text Output.
- In the navigation tree of the output viewer, expand the Model Fit node.

Some fit measures for the four automatically generated and identified models are shown here, along with fit measures for the saturated and independence models.

- Click CMIN under the Model Fit node.

The CMIN table shows that none of the generated models can be rejected when tested against the saturated model.

| Model | NPAR | CMIN | DF | P | CMIN/DF |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Measurement intercepts | 30 | 22.593 | 24 | 0.544 | 0.941 |
| Structural means | 28 | 30.624 | 26 | 0.243 | 1.178 |
| Structural covariances | 25 | 34.381 | 29 | 0.226 | 1.186 |
| Measurement residuals | 19 | 38.459 | 35 | 0.316 | 1.099 |
| Saturated model | 54 | 0.00 | 0 |  |  |
| Independence model | 24 | 337.553 | 30 | 0.00 | 11.252 |

On the other hand, the change in chi-square ($30.62-22.59=8.03$) when introducing the equal-factor-means constraint looks large compared to the change in degrees of freedom ($26-24=2$).

- In the navigation tree, click the Model Comparison node.

Assuming model Measurement intercepts to be correct, the following table shows that this chi-square difference is significant:

| Model | DF | CMIN | P | NFI Delta-1 | IFI Delta-2 | RFI rho-1 | TLI rho2 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Structural means | 2 | 8.030 | 0.018 | 0.024 | 0.026 | 0.021 | 0.023 |
| Structural covariances | 5 | 11.787 | 0.038 | 0.035 | 0.038 | 0.022 | 0.024 |
| Measurement residuals | 11 | 15.865 | 0.146 | 0.047 | 0.051 | 0.014 | 0.015 |
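
The P column of the comparison table can be approximately reproduced from the CMIN table. The sketch below is illustrative Python, not Amos code. The helper `chi2_sf` is a hypothetical name for a chi-square upper-tail function written with the standard library only, using the finite-series formulas for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over r < df/2 of (x/2)^r / r!
        half = x / 2.0
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite series in chi = sqrt(x).
    chi = math.sqrt(x)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * density * total

# Baseline: the Measurement intercepts model (CMIN, DF) from the CMIN table.
base_cmin, base_df = 22.593, 24
nested = {
    "Structural means": (30.624, 26),
    "Structural covariances": (34.381, 29),
    "Measurement residuals": (38.459, 35),
}

p_values = {}
for name, (cmin, df) in nested.items():
    p_values[name] = chi2_sf(cmin - base_cmin, df - base_df)
    print(f"{name}: p = {p_values[name]:.3f}")
```

Small discrepancies from the table in the last digit are possible because the inputs here are already rounded to three decimals.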

In the preceding two tables, two chi-square statistics and their associated degrees of freedom are especially important. The first, $\chi^{2}=22.59$ with $df=24$, allowed accepting the hypothesis of equal intercepts and equal regression weights in the measurement model. It was important to establish the credibility of this hypothesis because, without equal intercepts and equal regression weights, it would be unclear that the factors have the same meaning for boys as for girls and so there would be no interest in comparing their means. The other important chi-square statistic, $\chi^{2}=8.03$ with $df=2$, leads to rejection of the hypothesis that boys and girls have the same factor means.

Group differences between the boys' and girls' factor means can be determined from the girls' estimates in the Measurement intercepts model.

- Select the Measurement intercepts model in the pane at the lower left of the output viewer.
- In the navigation tree, click Estimates, then Scalars, and then Means.

The boys' means were fixed at 0, so only the girls' means were estimated, as shown in the following table:

|  |  |  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| spatial |  |  | -1.066 | 0.881 | -1.209 | 0.226 | m1_1 |
| verbal |  |  | 0.956 | 0.521 | 1.836 | 0.066 | m2_1 |

These estimates were discussed in Model A of Example 15, which is identical to the present Measurement intercepts model. (Model B of Example 15 is identical to the present Structural means model.)
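
The critical ratios and P values in the Means table can be approximately reproduced from the rounded estimates and standard errors. This is a sketch in Python under the assumption, which the tabled values support, that C.R. is the estimate divided by its standard error and that P is a two-sided normal tail probability.

```python
import math

# factor: (estimate, standard error), copied from the Means table.
estimates = {"spatial": (-1.066, 0.881), "verbal": (0.956, 0.521)}

results = {}
for factor, (est, se) in estimates.items():
    cr = est / se                                   # critical ratio
    p = math.erfc(abs(cr) / math.sqrt(2.0))         # two-sided normal p value
    results[factor] = (cr, p)
    print(f"{factor}: C.R. = {cr:.3f}, P = {p:.3f}")
```

Because the inputs are rounded, the last digit may differ slightly from the table. Neither difference is significant at the 0.05 level when tested individually, even though the joint test of equal factor means above rejects.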

## Example 25

## Multiple-Group Analysis

## Introduction

This example shows you how to automatically implement Sörbom's alternative to analysis of covariance.

Example 16 demonstrates the benefits of Sörbom's approach to analysis of covariance with latent variables. Unfortunately, as Example 16 also showed, the Sörbom approach is difficult to apply, involving many steps. This example automatically obtains the same results as Example 16.

## About the Data

The Olsson (1973) data from Example 16 will be used here. The sample moments can be found in the workbook UserGuide.xls. Sample moments from the experimental group are in the worksheet Olss_exp. Sample moments from the control group are in the worksheet Olss_cnt.

## About the Model

The model was described in Example 16. The Sörbom method requires that the experimental and the control group have the same path diagram.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-92eb6f027d.jpg)
Figure 25-1: Sörbom model for Olsson data

## Specifying the Model

- Open %examples%\Ex25.amw.

The path diagram is the same for the control and experimental groups and is shown in Figure 25-1. Some regression weights are fixed at 1. The means of all the residual (error) variables are fixed at 0. These constraints will remain in effect throughout the analysis.

## Constraining the Latent Variable Means and Intercepts

The model in Figure 25-1, Sörbom's model for Olsson data, is unidentified and will remain unidentified for every set of cross-group constraints that Amos automatically generates. For every set of cross-group constraints, the mean of pre_verbal and the intercept in the equation for predicting post_verbal will be unidentified. In order to allow the model to be identified for at least some cross-group constraints, it is necessary to pick one group, such as the control group, and fix the pre_verbal mean and the post_verbal intercept to a constant, such as 0.

- In the List of Groups pane to the left of the path diagram, ensure that Control is selected. This indicates that the path diagram for the control group is displayed in the drawing area.
- In the drawing area, right-click pre_verbal and choose Object Properties from the pop-up menu.
- In the Object Properties dialog, click the Parameters tab.
- In the Mean text box, type 0.
- With the Object Properties dialog still open, click post_verbal in the drawing area.
- In the Intercept text box of the Object Properties dialog, type 0.
- Close the Object Properties dialog.

Now, the path diagram for the control group appears as follows:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5464132781.jpg)

The path diagram for the experimental group continues to look like Figure 25-1.

## Generating Cross-Group Constraints

- From the menus, choose Analyze > Multiple-Group Analysis.
- Click OK in the message box that appears.

The Multiple-Group Analysis dialog appears. It lists the parameter subsets as rows (Measurement weights, Measurement intercepts, Structural weights, Structural intercepts, Structural means, Structural covariances, Structural residuals, and Measurement residuals) and the models as numbered columns, with a check box in each cell. The dialog also has Help, Default, OK, and Cancel buttons.


- Click OK to generate the following nested hierarchy of eight models:

| Model | Constraints |
| :--- | :--- |
| Model 1 (column 1) | Measurement weights (factor loadings) are constant across groups. |
| Model 2 (column 2) | All of the above, and measurement intercepts (intercepts in the equations for predicting measured variables) are constant across groups. |
| Model 3 (column 3) | All of the above, and the structural weight (the regression weight for predicting post_verbal) is constant across groups. |
| Model 4 (column 4) | All of the above, and the structural intercept (the intercept in the equation for predicting post_verbal) is constant across groups. |
| Model 5 (column 5) | All of the above, and the structural mean (the mean of pre_verbal) is constant across groups. |
| Model 6 (column 6) | All of the above, and the structural covariance (the variance of pre_verbal) is constant across groups. |
| Model 7 (column 7) | All of the above, and the structural residual (the variance of zeta) is constant across groups. |
| Model 8 (column 8) | All parameters are constant across groups. |
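
The nesting in this table can be expressed compactly: each model constrains the first few parameter subsets in a fixed order. The following Python sketch is only an illustration of that structure; the names `subsets` and `hierarchy` are hypothetical and are not Amos identifiers.

```python
# Parameter subsets in the order in which the default hierarchy adds them.
subsets = [
    "Measurement weights",
    "Measurement intercepts",
    "Structural weights",
    "Structural intercepts",
    "Structural means",
    "Structural covariances",
    "Structural residuals",
    "Measurement residuals",
]

# Model k holds the first k subsets constant across groups.
hierarchy = {f"Model {k}": subsets[:k] for k in range(1, len(subsets) + 1)}

for name, constrained in hierarchy.items():
    print(f"{name}: {', '.join(constrained)}")
```

Because every model's constraint list contains the previous model's list, any two models in the hierarchy can be compared with a chi-square difference test.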

## Fitting the Models

- From the menus, choose Analyze > Calculate Estimates.

The panel to the left of the path diagram shows that two models could not be fitted to the data. The two models that could not be fitted, the Unconstrained model and the Measurement weights model, are unidentified.

XX: Unconstrained
XX: Measurement weights
OK: Measurement intercepts
OK: Structural weights
OK: Structural intercepts
OK: Structural means
OK: Structural covariances
OK: Structural residuals
OK: Measurement residuals

## Viewing the Text Output

- From the menus, choose View > Text Output.
- In the navigation tree of the output viewer, expand the Model Fit node, and click CMIN. This displays some fit measures for the seven automatically generated and identified models, along with fit measures for the saturated and independence models, as shown in the following CMIN table:

| Model | NPAR | CMIN | DF | P | CMIN/DF |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Measurement intercepts | 22 | 34.775 | 6 | 0.000 | 5.796 |
| Structural weights | 21 | 36.340 | 7 | 0.000 | 5.191 |
| Structural intercepts | 20 | 84.060 | 8 | 0.000 | 10.507 |
| Structural means | 19 | 94.970 | 9 | 0.000 | 10.552 |
| Structural covariances | 18 | 99.976 | 10 | 0.000 | 9.998 |
| Structural residuals | 17 | 112.143 | 11 | 0.000 | 10.195 |
| Measurement residuals | 13 | 122.366 | 15 | 0.000 | 8.158 |
| Saturated model | 28 | 0.000 | 0 |  |  |
| Independence model | 16 | 682.638 | 12 | 0.000 | 56.887 |

There are many chi-square statistics in this table, but only two of them matter. The Sörbom procedure comes down to two basic questions. First, does the Structural weights model fit? This model specifies that the regression weight for predicting post_verbal from pre_verbal be constant across groups.

If the Structural weights model is accepted, one follows up by asking whether the next model up the hierarchy, the Structural intercepts model, fits significantly worse. On the other hand, if the Structural weights model has to be rejected, one never gets to the question about the Structural intercepts model. Unfortunately, that is the case here. The Structural weights model, with $\chi^{2}=36.34$ and $df=7$, is rejected at any conventional significance level.
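
The two-question procedure just described can be written as a small decision function. This is a sketch in Python for illustration; the function name, its arguments, and the 0.05 cutoff are assumptions made here and are not part of Amos.

```python
def sorbom_decision(weights_p, intercepts_diff_p=None, alpha=0.05):
    """Apply the two questions of the procedure in order.

    weights_p         -- P for the Structural weights model (vs. saturated)
    intercepts_diff_p -- P for Structural intercepts vs. Structural weights
    alpha             -- significance level (assumed here, not set by Amos)
    """
    # Question 1: does the Structural weights model fit?
    if weights_p < alpha:
        return "weights model rejected; intercepts are not tested"
    # Question 2: does the Structural intercepts model fit significantly worse?
    if intercepts_diff_p is None:
        return "weights model accepted; intercepts comparison still needed"
    if intercepts_diff_p < alpha:
        return "intercepts differ across groups"
    return "no significant difference in intercepts"

# P for the Structural weights model as reported in the CMIN table above.
print(sorbom_decision(weights_p=0.000))
```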

## Examining the Modification Indices

To see if it is possible to improve the fit of the Structural weights model:

- Close the output viewer.
- From the Amos Graphics menus, choose View > Analysis Properties.
- Click the Output tab and select the Modification Indices check box.
- Close the Analysis Properties dialog.
- From the menus, choose Analyze > Calculate Estimates to fit all models.

Only the modification indices for the Structural weights model need to be examined because this is the only model whose fit is essential to the analysis.

- From the menus, choose View > Text Output, select Modification Indices in the navigation tree of the output viewer, then select Structural weights in the lower left panel.
- Expand the Modification Indices node and select Covariances.

As you can see in the following covariance table for the control group, only one modification index exceeds the default threshold of 4:

|  | M.I. | Par Change |
| :--- | :--- | :--- |
| eps2 <--> eps4 | 4.553 | 2.073 |

- Now click experimental in the panel on the left.

As you can see in the following covariance table for the experimental group, there are four modification indices greater than 4:

|  | M.I. | Par Change |
| :--- | :--- | :--- |
| eps2 <--> eps4 | 9.314 | 4.417 |
| eps2 <--> eps3 | 9.393 | -4.117 |
| eps1 <--> eps4 | 8.513 | -3.947 |
| eps1 <--> eps3 | 6.192 | 3.110 |

Of these, only two modifications have an obvious theoretical justification: allowing eps2 to correlate with eps4, and allowing eps1 to correlate with eps3. Between these two, allowing eps2 to correlate with eps4 has the larger modification index. Thus the modification indices from the control group and the experimental group both suggest allowing eps2 to correlate with eps4.
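
The selection logic in the preceding paragraph can be sketched as follows. This is illustrative Python, not Amos functionality: the modification indices are copied from the two tables above, and the set `justified` simply encodes the two modifications the text identifies as theoretically justified.

```python
# Modification indices copied from the two covariance tables.
mi_control = {("eps2", "eps4"): 4.553}
mi_experimental = {
    ("eps2", "eps4"): 9.314,
    ("eps2", "eps3"): 9.393,
    ("eps1", "eps4"): 8.513,
    ("eps1", "eps3"): 6.192,
}

threshold = 4  # default threshold mentioned in the text
justified = {("eps2", "eps4"), ("eps1", "eps3")}  # per the paragraph above

# Keep only modifications that exceed the threshold and are justified.
candidates = {
    pair: mi
    for pair, mi in mi_experimental.items()
    if mi > threshold and pair in justified
}

chosen = max(candidates, key=candidates.get)
print(chosen, "also flagged in control group:", chosen in mi_control)
```

Note that eps2 <--> eps3 has the largest index overall but is excluded because it lacks a theoretical justification.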

## Modifying the Model and Repeating the Analysis

- Close the output viewer.
- From the menus, choose Diagram > Draw Covariances.
- Click and drag to draw a double-headed arrow between eps2 and eps4.
- From the menus, choose Analyze > Multiple-Group Analysis, and click OK in the message box that appears.
- In the Multiple-Group Analysis dialog, click OK.
- From the menus, choose Analyze > Calculate Estimates to fit all models.
- From the menus, choose View > Text Output.
- Use the navigation tree to view the fit measures for the Structural weights model.

With the additional double-headed arrow connecting eps2 and eps4, the Structural weights model has an adequate fit ($\chi^{2}=3.98$ with $df=5$), as shown in the following CMIN table:

| Model | NPAR | CMIN | DF | P | CMIN/DF |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Measurement intercepts | 24 | 2.797 | 4 | 0.59 | 0.699 |
| Structural weights | 23 | 3.976 | 5 | 0.55 | 0.795 |
| Structural intercepts | 22 | 55.094 | 6 | 0.00 | 9.182 |
| Structural means | 21 | 63.792 | 7 | 0.00 | 9.113 |
| Structural covariances | 20 | 69.494 | 8 | 0.00 | 8.687 |
| Structural residuals | 19 | 83.194 | 9 | 0.00 | 9.244 |
| Measurement residuals | 14 | 93.197 | 14 | 0.00 | 6.657 |
| Saturated model | 28 | 0.000 | 0 |  |  |
| Independence model | 16 | 682.638 | 12 | 0.00 | 56.887 |

Now that the Structural weights model fits the data, it can be asked whether the Structural intercepts model fits significantly worse. Assuming the Structural weights model to be correct:

| Model | DF | CMIN | P | NFI Delta-1 | IFI Delta-2 | RFI rho-1 | TLI rho2 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Structural intercepts | 1 | 51.118 | 0.000 | 0.075 | 0.075 | 0.147 | 0.150 |
| Structural means | 2 | 59.816 | 0.000 | 0.088 | 0.088 | 0.146 | 0.149 |
| Structural covariances | 3 | 65.518 | 0.000 | 0.096 | 0.097 | 0.139 | 0.141 |
| Structural residuals | 4 | 79.218 | 0.000 | 0.116 | 0.117 | 0.149 | 0.151 |
| Measurement residuals | 9 | 89.221 | 0.000 | 0.131 | 0.132 | 0.103 | 0.105 |

The Structural intercepts model does fit significantly worse than the Structural weights model. When the intercept in the equation for predicting post_verbal is required to be constant across groups, the chi-square statistic increases by 51.12 while the degrees of freedom increase by only 1. That is, the intercept for the experimental group differs significantly from the intercept for the control group. The intercept for the experimental group is estimated to be 3.627.
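
The chi-square difference quoted above follows from the CMIN table for the modified model. The short Python check below is illustrative only; it relies on the fact that, for 1 degree of freedom, the chi-square upper tail equals `erfc(sqrt(x / 2))`.

```python
import math

# (CMIN, DF) copied from the CMIN table for the modified model.
weights_cmin, weights_df = 3.976, 5          # Structural weights
intercepts_cmin, intercepts_df = 55.094, 6   # Structural intercepts

diff_cmin = intercepts_cmin - weights_cmin
diff_df = intercepts_df - weights_df

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(diff_cmin / 2.0))
print(f"chi-square difference = {diff_cmin:.3f}, df = {diff_df}, p = {p_value:.2e}")
```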

|  | Estimate | S.E. | C.R. | P | Label |
| :--- | :--- | :--- | :--- | :--- | :--- |
| post_verbal | 3.627 | 0.478 | 7.591 | $<0.001$ | j1_2 |
| pre_syn | 18.619 | 0.594 | 31.355 | $<0.001$ | i1_1 |
| pre_opp | 19.910 | 0.541 | 36.781 | $<0.001$ | i2_1 |
| post_syn | 20.383 | 0.535 | 38.066 | $<0.001$ | i3_1 |
| post_opp | 21.204 | 0.531 | 39.908 | $<0.001$ | i4_1 |

Recalling that the intercept for the control group was fixed at 0, it is estimated that the treatment increases post_verbal scores by 3.63 with pre_verbal held constant.

The results obtained in the present example are identical to the results of Example 16. The Structural weights model is the same as Model D in Example 16. The Structural intercepts model is the same as Model E in Example 16.

## Example 26

## Bayesian Estimation

## Introduction

This example demonstrates Bayesian estimation using Amos.

## Bayesian Estimation

In maximum likelihood estimation and hypothesis testing, the true values of the model parameters are viewed as fixed but unknown, and the estimates of those parameters from a given sample are viewed as random but known. An alternative kind of statistical inference, called the Bayesian approach, views any quantity that is unknown as a random variable and assigns it a probability distribution. From a Bayesian standpoint, true model parameters are unknown and therefore considered to be random, and they are assigned a joint probability distribution. This distribution is not meant to suggest that the parameters are varying or changing in some fashion. Rather, the distribution is intended to summarize our state of knowledge, or what is currently known about the parameters. The distribution of the parameters before the data are seen is called a prior distribution. Once the data are observed, the evidence provided by the data is combined with the prior distribution by a well-known formula called Bayes' Theorem. The result is an updated distribution for the parameters, called a posterior distribution, which reflects a combination of prior belief and empirical evidence (Bolstad and Curran, 2017).

Human beings tend to have difficulty visualizing and interpreting the joint posterior distribution for the parameters of a model. Therefore, when performing a Bayesian analysis, one needs summaries of the posterior distribution that are easy to interpret. A good way to start is to plot the marginal posterior density for each parameter, one at a time. Often, especially with large data samples, the marginal posterior distributions for parameters tend to resemble normal distributions. The mean of a marginal posterior distribution, called a posterior mean, can be reported as a parameter estimate. The posterior standard deviation, the standard deviation of the distribution, is a useful measure of uncertainty similar to a conventional standard error.

The analogue of a confidence interval may be computed from the percentiles of the marginal posterior distribution; the interval that runs from the 2.5 percentile to the 97.5 percentile forms a Bayesian 95% credible interval. If the marginal posterior distribution is approximately normal, the 95% credible interval will be approximately equal to the posterior mean $\pm 1.96$ posterior standard deviations. In that case, the credible interval becomes essentially identical to an ordinary confidence interval that assumes a normal sampling distribution for the parameter estimate. If the posterior distribution is not normal, the interval will not be symmetric about the posterior mean. In that case, the Bayesian version often has better properties than the conventional one.
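
To make the percentile construction concrete, the following Python sketch computes a 95% credible interval from simulated draws and compares it with the normal approximation. The draws are hypothetical (a normal distribution with arbitrary center and spread), and the `percentile` helper is a simple illustrative implementation, not the method Amos uses internally.

```python
import random
import statistics

random.seed(0)
# Hypothetical posterior draws; the normal shape is assumed for illustration.
draws = sorted(random.gauss(-6.5, 10.0) for _ in range(20000))

def percentile(sorted_values, q):
    """Nearest-rank style percentile of pre-sorted data, q in [0, 100]."""
    index = round(q / 100 * (len(sorted_values) - 1))
    return sorted_values[index]

# Percentile-based 95% credible interval.
lower, upper = percentile(draws, 2.5), percentile(draws, 97.5)

# Normal approximation: posterior mean +/- 1.96 posterior standard deviations.
mean = statistics.fmean(draws)
sd = statistics.stdev(draws)
approx = (mean - 1.96 * sd, mean + 1.96 * sd)

print(f"percentile interval: ({lower:.2f}, {upper:.2f})")
print(f"normal approximation: ({approx[0]:.2f}, {approx[1]:.2f})")
```

With draws from a skewed distribution instead, the two intervals would no longer agree, and only the percentile interval would reflect the asymmetry.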

Unlike a conventional confidence interval, the Bayesian credible interval is interpreted as a probability statement about the parameter itself; $\operatorname{Prob}(a \leq \theta \leq b)=0.95$ literally means that you are 95% sure that the true value of $\theta$ lies between $a$ and $b$. Tail areas from a marginal posterior distribution can even be used as a kind of Bayesian $p$ value for hypothesis testing. If 96.5% of the area under the marginal posterior density for $\theta$ lies to the right of some value $a$, then the Bayesian $p$ value for testing the null hypothesis $\theta \leq a$ against the alternative hypothesis $\theta>a$ is 0.035. In that case, one would actually say, "I'm 96.5% sure that the alternative hypothesis is true."
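
The tail-area calculation can be illustrated with a set of posterior draws. In this Python sketch the draws are hypothetical and deliberately constructed so that exactly 965 of 1,000 values lie to the right of the reference value; the variable names are illustrative.

```python
# Hypothetical posterior draws for theta, constructed so that exactly
# 965 of the 1,000 values lie above the reference value a = 0.
draws = [i / 100 for i in range(-34, 966)]
a = 0.0

n_right = sum(1 for d in draws if d > a)

# Posterior probability of the alternative hypothesis, theta > a.
prob_alternative = n_right / len(draws)

# Bayesian p value for the null hypothesis theta <= a: the remaining tail area.
bayes_p = (len(draws) - n_right) / len(draws)

print(prob_alternative, bayes_p)
```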

Although the idea of Bayesian inference dates back to the late 18th century, its use by statisticians has been rare until recently. For some, reluctance to apply Bayesian methods stems from a philosophical distaste for viewing probability as a state of belief and from the inherent subjectivity in choosing prior distributions. But for the most part, Bayesian analyses have been rare because computational methods for summarizing joint posterior distributions have been difficult or unavailable. Using a new class of simulation techniques called Markov chain Monte Carlo (MCMC), however, it is now possible to draw random values of parameters from high-dimensional joint posterior distributions, even in complex problems. With MCMC, obtaining posterior summaries becomes as simple as plotting histograms and computing sample means and percentiles.

## Selecting Priors

A prior distribution quantifies the researcher's belief concerning where the unknown parameter may lie. Knowledge of how a variable is distributed in the population can sometimes be used to help researchers select reasonable priors for parameters of interest. Hox (2002) cites the example of a normed intelligence test with a mean of 100 units and a standard deviation of 15 units in the general population. If the test is given to participants in a study who are fairly representative of the general population, then it would be reasonable to center the prior distributions for the mean and standard deviation of the test score at 100 and 15, respectively. Knowing that an observed variable is bounded may help us to place bounds on the parameters. For instance, the mean of a Likert-type survey item taking values $0, 1, \ldots, 10$ must lie between 0 and 10, and its maximum variance is 25. Prior distributions for the mean and variance of this item can be specified to enforce these bounds.
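
The maximum variance of 25 for the 0 to 10 item comes from the general bound that a variable confined to an interval of width $w$ has variance at most $w^{2}/4$, attained when half the responses fall at each endpoint. The following Python sketch checks this numerically; the response patterns are made up for illustration.

```python
import statistics

low, high = 0, 10

# Upper bound on the variance of any variable confined to [low, high].
bound = (high - low) ** 2 / 4

# The bound is attained when responses split evenly between the endpoints.
extreme = [low, high]
extreme_variance = statistics.pvariance(extreme)

# Any other pattern, such as one response at each scale point, stays below it.
spread_out = list(range(low, high + 1))
spread_variance = statistics.pvariance(spread_out)

print(bound, extreme_variance, spread_variance)
```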

In many cases, one would like to specify a prior distribution that introduces as little information as possible, so that the data may be allowed to speak for themselves. A prior distribution is said to be diffuse if it spreads its probability over a very wide range of parameter values. By default, Amos applies a uniform distribution from $-3.4 \times 10^{38}$ to $3.4 \times 10^{38}$ to each parameter.

Diffuse prior distributions are often said to be non-informative, and we will use that term as well. In a strict sense, however, no prior distribution is ever completely non-informative, not even a uniform distribution over the entire range of allowable values, because it would cease to be uniform if the parameter were transformed. (Suppose, for example, that the variance of a variable is uniformly distributed from 0 to $\infty$; then the standard deviation will not be uniformly distributed.) Every prior distribution carries with it at least some information. As the size of a dataset grows, the evidence from the data eventually swamps this information, and the influence of the prior distribution diminishes. Unless your sample is unusually small, or your model or prior distribution is strongly contradicted by the data, you will find that the answers from a Bayesian analysis tend to change very little if the prior is changed. Amos makes it easy for you to change the prior distribution for any parameter, so you can easily perform this kind of sensitivity check.
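
The parenthetical remark about transformations can be made explicit with a short derivation. To keep the distribution proper, assume for illustration that the variance $V$ is uniform on a finite interval $(0, c)$ rather than on $(0, \infty)$; the symbols $V$, $\sigma$, $s$, and $c$ are introduced here and are not Amos notation.

$$
V \sim \operatorname{Uniform}(0, c), \quad \sigma=\sqrt{V}
\;\Longrightarrow\;
\operatorname{Prob}(\sigma \leq s)=\operatorname{Prob}\left(V \leq s^{2}\right)=\frac{s^{2}}{c},
\qquad
f_{\sigma}(s)=\frac{2 s}{c}, \quad 0 \leq s \leq \sqrt{c}
$$

The density of $\sigma$ increases linearly in $s$, so a prior that is flat for the variance favors larger values of the standard deviation.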

## Performing Bayesian Estimation Using Amos Graphics

To illustrate Bayesian estimation using Amos Graphics, we revisit Example 3, which shows how to test the null hypothesis that the covariance between two variables is 0 by fixing the value of the covariance between age and vocabulary to 0.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f8122de820.jpg)

## Estimating the Covariance

The first thing we need to do for the present example is to remove the zero constraint on the covariance so that the covariance can be estimated.

- Open %examples%\Ex03.amw.
- Right-click the double-headed arrow in the path diagram and choose Object Properties from the pop-up menu.
- In the Object Properties dialog, click the Parameters tab.
- Delete the 0 in the Covariance text box.
- Close the Object Properties dialog.

This is the resulting path diagram (you can also find it in Ex26.amw):

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-584430930c.jpg)
Path diagram for Example 26 (Bayesian Estimation, Attig's (1983) old subjects, Model Specification). The caption inside the diagram reads `Chi-square = \cmin (\df df)` and `p = \p`.

## Results of Maximum Likelihood Analysis

Before performing a Bayesian analysis of this model, we perform a maximum likelihood analysis for comparison purposes.

- From the menus, choose Analyze > Calculate Estimates to display the following parameter estimates and standard errors:

| Covariances: (Group number 1 - Default model) |  |  |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- |
|  | Estimate | S.E. | C.R. | P | Label |
| age <--> vocabulary | -5.014 | 8.560 | -0.586 | 0.558 |  |


| Variances: (Group number 1 - Default model) |  |  |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- |
|  | Estimate | S.E. | C.R. | P | Label |
| age | 21.574 | 4.886 | 4.416 | *** |  |
| vocabulary | 131.294 | 29.732 | 4.416 | *** |  |

## Bayesian Analysis

Bayesian analysis requires estimation of explicit means and intercepts. Before performing any Bayesian analysis in Amos, you must first tell Amos to estimate means and intercepts.

- From the menus, choose View > Analysis Properties.

- Select Estimate means and intercepts. (A check mark will appear next to it.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5156a64569.jpg)
- To perform a Bayesian analysis, from the menus, choose Analyze > Bayesian Estimation, or press the keyboard combination Ctrl+B.

The Bayesian SEM window appears, and the MCMC algorithm immediately begins generating samples.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-94da22ad5c.jpg)

The Bayesian SEM window has a toolbar near the top of the window and has a results summary table below. Each row of the summary table describes the marginal posterior distribution of a single model parameter. The first column, labeled Mean, contains the posterior mean, which is the center or average of the posterior distribution. This can be used as a Bayesian point estimate of the parameter, based on the data and the prior distribution. With a large dataset, the posterior mean will tend to be close to the maximum likelihood estimate. (In this case, the two are somewhat close; compare the posterior mean of -6.536 for the age-vocabulary covariance to the maximum likelihood estimate of -5.014 reported earlier.)

## Replicating Bayesian Analysis and Data Imputation Results

The multiple imputation and Bayesian estimation algorithms implemented in Amos make extensive use of a stream of random numbers that depends on an initial random number seed. The default behavior of Amos is to change the random number seed every time you perform Bayesian estimation, Bayesian data imputation, or stochastic regression data imputation. Consequently, when you try to replicate one of those analyses, you can expect to get slightly different results because of using a different random number seed.

If, for any reason, you need an exact replication of an earlier analysis, you can do so by starting with the same random number seed that was used in the earlier analysis.

## Examining the Current Seed

To find out what the current random number seed is or to change its value:

- From the menus, choose Tools > Seed Manager.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-74e0e34052.jpg)

By default, Amos increments the current random number seed by one for each invocation of a simulation method that makes use of random numbers (either Bayesian SEM, stochastic regression data imputation, or Bayesian data imputation). Amos maintains a log of previous seeds used, so it is possible to match the file creation dates of previously generated analysis results or imputed datasets with the dates reported in the Seed Manager.

## Changing the Current Seed

- Click Change and enter a previously used seed before performing an analysis.

Amos will use the same stream of random numbers that it used the last time it started out with that seed. For example, we used the Seed Manager to discover that Amos used a seed of 14942405 when the analysis for this example was performed. To generate the same Bayesian analysis results as we did:

- Click Change and change the current seed to 14942405.

The following figure shows the Seed Manager dialog after making the change:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4582c684b9.jpg)

A more proactive approach is to select a fixed seed value prior to running a Bayesian or data imputation analysis. You can have Amos use the same seed value for all analyses if you select the Always use the same seed option.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e2cf13a4c1.jpg)

Record the value of this seed in a safe place so that you can replicate the results of your analysis at a later date.

Tip: We use the same seed value of 14942405 for all examples in this guide so that you can reproduce our results.
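The effect of fixing the seed can be illustrated outside Amos. The following is a minimal Python sketch, not Amos code: it assumes Python's standard `random` module as a stand-in generator (the generator that Amos uses is not described here) and shows that two streams started from the same seed are identical, while a stream started from the next seed is not.

```python
import random

def draw_stream(seed, n=5):
    """Return n pseudo-random normal draws from a generator started at `seed`."""
    rng = random.Random(seed)          # independent generator with a fixed seed
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

first_run = draw_stream(14942405)      # seed value quoted in the text
replication = draw_stream(14942405)    # same seed: identical stream
next_seed_run = draw_stream(14942406)  # seed incremented by one: different stream

print(first_run == replication)        # same seed reproduces the draws
print(first_run == next_seed_run)      # a different seed does not
```

This is why recording the seed is enough to replicate an analysis: everything downstream of the generator is deterministic.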

We mentioned earlier that the MCMC algorithm used by Amos draws random values of parameters from high-dimensional joint posterior distributions via Monte Carlo simulation of the posterior distribution of parameters. For instance, the value reported in the Mean column is not the exact posterior mean but is an estimate obtained by averaging across the random samples produced by the MCMC procedure. It is important to have at least a rough idea of how much uncertainty in the posterior mean is attributable to Monte Carlo sampling.

The second column, labeled S.E., reports an estimated standard error that suggests how far the Monte-Carlo estimated posterior mean may lie from the true posterior mean. As the MCMC procedure continues to generate more samples, the estimate of the posterior mean becomes more precise, and the S.E. gradually drops. Note that this S.E. is not an estimate of how far the posterior mean may lie from the unknown true value of the parameter. That is, one would not use $\pm 2$ S.E. values as the width of a 95% interval for the parameter.

The likely distance between the posterior mean and the unknown true parameter is reported in the third column, labeled S.D., and that number is analogous to the standard error in maximum likelihood estimation. Additional columns contain the convergence statistic (C.S.), the median value of each parameter, the lower and upper 50% boundaries of the distribution of each parameter, and the skewness, kurtosis, minimum value, and maximum value of each parameter. The lower and upper 50% boundaries are the endpoints of a 50% Bayesian credible set, which is the Bayesian analogue of a 50% confidence interval. Most of us are accustomed to using a confidence level of 95%, so we will soon show you how to change to 95%.
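The difference between S.E. and S.D. can be seen in a small simulation. The following Python sketch is illustrative only: the autocorrelated chain is a hypothetical stand-in for MCMC output, and the batch-means estimator of the Monte Carlo standard error is an assumption, since the formula that Amos uses for S.E. is not given here.

```python
import math
import random
import statistics

def ar1_chain(n, rho=0.9, seed=1):
    """Simulate an autocorrelated chain (a stand-in for MCMC output)."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain

def batch_means_se(chain, n_batches=20):
    """Estimate the Monte Carlo standard error of the mean from batch means."""
    size = len(chain) // n_batches
    means = [statistics.fmean(chain[i * size:(i + 1) * size])
             for i in range(n_batches)]
    return statistics.stdev(means) / math.sqrt(n_batches)

short_chain = ar1_chain(5_000)
long_chain = ar1_chain(80_000)
for chain in (short_chain, long_chain):
    # columns: number of samples, S.D. of the samples, Monte Carlo S.E. of the mean
    print(len(chain), round(statistics.stdev(chain), 3),
          round(batch_means_se(chain), 4))
```

With more samples, the S.E. shrinks toward 0 while the S.D. settles at the spread of the distribution itself, which is the behavior described above.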

When you choose Analyze > Bayesian Estimation, the MCMC algorithm begins sampling immediately, and it continues until you click the Pause Sampling button to halt the process. In the Bayesian SEM window shown earlier, sampling was halted after $500 + 5{,}831 = 6{,}331$ completed samples. Amos generated and discarded 500 burn-in samples prior to drawing the first sample that was retained for the analysis. Amos draws burn-in samples to allow the MCMC procedure to converge to the true joint posterior distribution. After Amos draws and discards the burn-in samples, it draws additional samples to give us a clear picture of what this joint posterior distribution looks like. In that example, Amos has drawn 5,831 of these analysis samples, and it is upon these analysis samples that the results in the summary table are based. Actually, the displayed results are for 500 burn-in and 5,500 analysis samples.

Because the sampling algorithm Amos uses is very fast, updating the summary table after each sample would lead to a rapid, incomprehensible blur of changing results in the Bayesian SEM window. It would also slow the analysis down. To avoid both problems, Amos refreshes the results after every 1,000 samples.

## Changing the Refresh Options

To change the refresh interval:

- From the menus, choose View > Options.
- Click the Refresh tab in the Options dialog to show the refresh options.



![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ba789105b0.jpg)

You can change the refresh interval to something other than the default of 1,000 observations. Alternatively, you can refresh the display at a regular time interval that you specify.

If you select Refresh the display manually, the display will never be updated automatically. Regardless of what you select on the Refresh tab, you can refresh the display manually at any time by clicking the Refresh button on the Bayesian SEM toolbar.

## Assessing Convergence

Are there enough samples to yield stable estimates of the parameters? Before addressing this question, let us briefly discuss what it means for the procedure to have converged. Convergence of an MCMC algorithm is quite different from convergence of a nonrandom method such as maximum likelihood. To properly understand MCMC convergence, we need to distinguish two different types.

The first type, which we may call convergence in distribution, means that the analysis samples are, in fact, being drawn from the actual joint posterior distribution of the parameters. Convergence in distribution takes place in the burn-in period, during which the algorithm gradually forgets its initial starting values. Because the burn-in samples may not be representative of the actual posterior distribution, they are discarded. The default burn-in period of 500 is quite conservative, much longer than needed for most problems. Once the burn-in period is over and Amos begins to collect the analysis samples, one may ask whether there are enough of these samples to accurately estimate the summary statistics, such as the posterior mean.

That question pertains to the second type of convergence, which we may call convergence of posterior summaries. Convergence of posterior summaries is complicated by the fact that the analysis samples are not independent but are actually an autocorrelated time series. The 1,001st sample is correlated with the 1,000th, which, in turn, is correlated with the 999th, and so on. These correlations are an inherent feature of MCMC, and because of these correlations, the summary statistics from 5,500 (or whatever number of) analysis samples have more variability than they would if the 5,500 samples had been independent. Nevertheless, as we continue to accumulate more and more analysis samples, the posterior summaries gradually stabilize.

Amos provides several diagnostics that help you check convergence. Notice the value 1.0083 on the toolbar of the Bayesian SEM window shown earlier. This is an overall convergence statistic based on a measure suggested by Gelman et al. (2013). Each time the screen refreshes, Amos updates the C.S. for each parameter in the summary table; the C.S. value on the toolbar is the largest of the individual C.S. values. By default, Amos judges the procedure to have converged if the largest of the C.S. values is less than 1.002. By this standard, the maximum C.S. of 1.0083 is not small enough. Amos displays an unhappy face when the overall C.S. is not small enough. The C.S. compares the variability within parts of the analysis sample to the variability across these parts. A value of 1.000 represents perfect convergence, and larger values indicate that the posterior summaries can be made more precise by creating more analysis samples.
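To make the idea of comparing variability within parts to variability across parts concrete, here is a hedged Python sketch of a Gelman-style ratio. The exact formula that Amos uses for the C.S. is not given in this guide, so the function below (its name, the number of parts, and the pooling rule) is an assumption chosen for illustration.

```python
import random
import statistics

def convergence_statistic(chain, n_parts=4):
    """Gelman-style ratio comparing within-part and across-part variability."""
    size = len(chain) // n_parts
    parts = [chain[i * size:(i + 1) * size] for i in range(n_parts)]
    # Average variance inside each part.
    within = statistics.fmean(statistics.variance(p) for p in parts)
    # Variance of the part means, scaled by the part length.
    between = size * statistics.variance([statistics.fmean(p) for p in parts])
    pooled = (size - 1) / size * within + between / size
    return (pooled / within) ** 0.5

rng = random.Random(0)
stable = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
# Add a slow upward drift to mimic a chain that has not settled down.
drifting = [x + 3.0 * i / len(stable) for i, x in enumerate(stable)]

cs_stable = convergence_statistic(stable)
cs_drifting = convergence_statistic(drifting)
print(round(cs_stable, 4), round(cs_drifting, 4))
```

A chain whose parts all look alike gives a value very close to 1, whereas a drifting chain gives a clearly larger value, which is the pattern the happy and unhappy faces summarize.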


Clicking the Pause Sampling button a second time instructs Amos to resume the sampling process. You can also pause and resume sampling by choosing Pause Sampling from the Analyze menu, or by using the keyboard combination Ctrl+E. The next figure shows the results after resuming the sampling for a while and pausing again.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-af5879d72b.jpg)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-84ab2701d3.jpg)

Toolbar sample counter: (500+22,500). Group number 1.

|  | Mean | S.E. | S.D. | C.S. | Median | 50% Lower bound | 50% Upper bound | Skewness | Kurtosis | Min | Max | Name |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | :--- |
| Means |  |  |  |  |  |  |  |  |  |  |  |  |
| age | 70.969 | 0.024 | 0.811 | 1.000 | 70.972 | 70.431 | 71.500 | -0.090 | 0.355 | 67.094 | 74.077 |  |
| vocabulary | 62.570 | 0.055 | 2.002 | 1.000 | 62.566 | 61.228 | 63.874 | -0.032 | 0.252 | 53.845 | 69.860 |  |
| Covariances |  |  |  |  |  |  |  |  |  |  |  |  |
| age<->vocabulary | -6.462 | 0.334 | 10.902 | 1.000 | -6.214 | -13.214 | 0.421 | -0.160 | 0.860 | -70.086 | 43.096 |  |
| Variances |  |  |  |  |  |  |  |  |  |  |  |  |
| age | 26.170 | 0.328 | 6.824 | 1.001 | 24.979 | 21.458 | 29.697 | 1.213 | 2.577 | 12.053 | 70.861 |  |
| vocabulary | 158.773 | 1.965 | 39.617 | 1.001 | 152.990 | 130.336 | 180.134 | 0.959 | 1.417 | 72.285 | 372.253 |  |

At this point, we have 22,501 analysis samples, although the display was most recently updated at the 22,500th sample. The largest C.S. is 1.0012, which is below the 1.002 criterion that indicates acceptable convergence. Reflecting the satisfactory convergence, Amos now displays a happy face. Gelman et al. (2013) suggest that, for many analyses, values of 1.10 or smaller are sufficient. The default criterion of 1.002 is conservative. Judging that the MCMC chain has converged by this criterion does not mean that the summary table will stop changing. The summary table will continue to change as long as the MCMC algorithm keeps running. As the overall C.S. value on the toolbar approaches 1.000, however, there is not much more precision to be gained by taking additional samples, so we might as well stop.

## Diagnostic Plots

In addition to the C.S. value, Amos offers several plots that can help you check convergence of the Bayesian MCMC method. To view these plots:

- From the menus, choose View > Posterior.

Amos displays the Posterior dialog.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a133893578.jpg)


- Select the age<->vocabulary parameter from the Bayesian SEM window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-694487a3f0.jpg)

The Posterior dialog now displays a frequency polygon of the distribution of the age-vocabulary covariance across the 22,500 samples.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6f4430a5e1.jpg)

One visual aid you can use to judge whether it is likely that Amos has converged to the posterior distribution is a simultaneous display of two estimates of the distribution, one obtained from the first third of the accumulated samples and another obtained from the last third. To display the two estimates of the marginal posterior on the same graph:

- Select First and last. (A check mark will appear next to the option.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c29fbccfd2.jpg)

In this example, the distributions of the first and last thirds of the analysis samples are almost identical, which suggests that Amos has successfully identified the important features of the posterior distribution of the age-vocabulary covariance. Note that this posterior distribution appears to be centered at some value near -6, which agrees with the Mean value for this parameter. Visual inspection suggests that the standard deviation is roughly 10, which agrees with the value of S.D.

Notice that more than half of the sampled values are to the left of 0. This provides mild evidence that the true value of the covariance parameter is negative, but this result is not statistically significant because the proportion to the right of 0 is still quite large. If the proportion of sampled values to the right of 0 were very small (for example, less than 5%), then we would be able to reject the null hypothesis that the covariance parameter is greater than or equal to 0. In this case, however, we cannot.
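The reasoning about the proportion of sampled values on each side of 0 can be sketched in a few lines of Python. The samples below are simulated, not Amos output: as an assumption for illustration, they are drawn from a normal distribution with the Mean (-6.462) and S.D. (10.902) reported for the age-vocabulary covariance in the summary table.

```python
import random

rng = random.Random(2)
# Hypothetical stand-in for the 22,500 sampled covariance values.
samples = [rng.gauss(-6.462, 10.902) for _ in range(22_500)]

# Proportion of sampled values to the right of 0.
prop_right = sum(s > 0 for s in samples) / len(samples)

# Reject "covariance >= 0" only if very few samples lie to the right of 0.
reject_at_5_percent = prop_right < 0.05

print(round(prop_right, 3), reject_at_5_percent)
```

Under these assumptions, roughly a quarter of the values fall to the right of 0, far more than 5%, so the null hypothesis is not rejected.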

Another plot that helps in assessing convergence is the trace plot. The trace plot, sometimes called a time-series plot, shows the sampled values of a parameter over time. This plot helps you to judge how quickly the MCMC procedure converges in distribution, that is, how quickly it forgets its starting values.

- To view the trace plot, select Trace.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-82ba9c1d4c.jpg)

The plot shown here is quite ideal. It exhibits rapid up-and-down variation with no long-term trends or drifts. If we were to mentally break up this plot into a few horizontal sections, the trace within any section would not look much different from the trace in any other section. This indicates that the convergence in distribution takes place rapidly. Long-term trends or drifts in the plot indicate slower convergence. (Note that long-term is relative to the horizontal scale of this plot, which depends on the number of samples. As we take more samples, the trace plot gets squeezed together like an accordion, and slow drifts or trends eventually begin to look like rapid up-and-down variation.) The rapid up-and-down motion means that the sampled value at any iteration is unrelated to the sampled value $k$ iterations later, for values of $k$ that are small relative to the total number of samples.

To see how long it takes for the correlations among the samples to die down, we can examine a third plot, called an autocorrelation plot. This plot displays the estimated correlation between the sampled value at any iteration and the sampled value $k$ iterations later for $k=1,2,3, \ldots$.

- To display this plot, select Autocorrelation.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c431889faf.jpg)

Lag, along the horizontal axis, refers to the spacing at which the correlation is estimated. In ordinary situations, we expect the autocorrelation coefficients to die down and become close to 0, and remain near 0, beyond a certain lag. In the autocorrelation plot shown above, the lag-10 correlation (the correlation between any sampled value and the value drawn 10 iterations later) is approximately 0.50. The lag-35 correlation lies below 0.20, and at lag 90 and beyond, the correlation is effectively 0. This indicates that by 90 iterations, the MCMC procedure has essentially forgotten its starting position, at least as far as this covariance parameter is concerned. Forgetting the starting position is equivalent to convergence in distribution. If we were to examine the autocorrelation plots for the other parameters in the model, we would find that they also effectively die down to 0 by 90 or so iterations. This fact gives us confidence that a burn-in period of 500 samples was more than enough to ensure that convergence in distribution was attained, and that the analysis samples are indeed samples from the true posterior distribution.
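The quantity shown in an autocorrelation plot can be computed directly from a sequence of samples. The following Python sketch is illustrative only: the chain is a simulated first-order autoregressive series, an assumption chosen so that its lag-10 correlation is near 0.5, and it is not the chain produced by Amos.

```python
import random
import statistics

def autocorrelation(chain, lag):
    """Estimate the correlation between values `lag` iterations apart."""
    mean = statistics.fmean(chain)
    denom = sum((x - mean) ** 2 for x in chain)
    numer = sum((chain[i] - mean) * (chain[i + lag] - mean)
                for i in range(len(chain) - lag))
    return numer / denom

rng = random.Random(3)
rho = 0.933           # hypothetical value; 0.933 ** 10 is about 0.5
x, chain = 0.0, []
for _ in range(22_500):
    x = rho * x + rng.gauss(0.0, 1.0)
    chain.append(x)

# Estimated autocorrelation at the three lags discussed in the text.
acf = {lag: autocorrelation(chain, lag) for lag in (10, 35, 90)}
for lag, value in acf.items():
    print(lag, round(value, 3))
```

For a chain like this one, the estimates fall from about 0.5 at lag 10 to nearly 0 at lag 90, mirroring the pattern read off the plot.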

In certain pathological situations, the MCMC procedure may converge very slowly or not at all. This may happen in data sets with high proportions of missing values, when the missing values fall in a peculiar pattern, or in models with some parameters that are poorly estimated. If this should happen, the trace plots for one or more parameters in the model will have long-term drifts or trends that do not diminish as more and more samples are taken. Even as the trace plot gets squeezed together like an accordion, the drifts and trends will not go away. In that case, you will probably see that the range of sampled values for the parameter (as indicated by the vertical scale of the trace plot, or by the S.D. or the difference between Min and Max in the Bayesian SEM window) is huge. The autocorrelations may remain high for large lags or may appear to oscillate between positive and negative values for a long time. When this happens, it suggests that the model is too complicated to be supported by the data at hand, and we ought to consider either fitting a simpler model or introducing information about the parameters by specifying a more informative prior distribution.

## Bivariate Marginal Posterior Plots

The summary table in the Bayesian SEM window and the frequency polygon in each Posterior dialog box describe the marginal posterior distributions of the estimands, one at a time. The marginal posterior distributions are very important, but they do not reveal relationships that may exist among the estimands. For example, two covariances or regression coefficients may share significance in the sense that either one could plausibly be 0 , but both cannot. To help us visualize the relationships among pairs of estimands, Amos provides bivariate marginal posterior plots.

- To display the marginal posterior of two parameters, begin by displaying the posterior distribution of one of the parameters (for example, the variance of age).
- Hold down the control (Ctrl) key on the keyboard and select the second parameter in the summary table (for example, the variance of vocabulary).

Amos then displays a three-dimensional surface plot of the marginal posterior distribution of the variances of age and vocabulary.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-fe32ccd82e.jpg)

- Select Histogram to display a similar plot using vertical blocks.
- Select Contour to display a two-dimensional plot of the bivariate posterior density.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3438b5d7ee.jpg)

Ranging from dark to light, the three shades of gray represent 50%, 90%, and 95% credible regions, respectively. A credible region is conceptually similar to a bivariate confidence region that is familiar to most data analysts acquainted with classical statistical inference methods.

## Credible Intervals

Recall that the summary table in the Bayesian SEM window displays the lower and upper endpoints of a Bayesian credible interval for each estimand. By default, Amos presents a 50% interval, which is similar to a conventional 50% confidence interval.

Researchers often report 95% confidence intervals, so you may want to change the boundaries to correspond to a posterior probability content of 95%.
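A credible interval with a given probability content can be read off the sorted samples. The Python sketch below is a rough illustration under two assumptions: the samples are simulated rather than taken from Amos, and the interval is equal-tailed (cutting the same proportion from each tail), which may or may not match the rule that Amos applies.

```python
import random

def credible_interval(samples, level):
    """Equal-tailed interval containing `level` percent of the sampled values."""
    ordered = sorted(samples)
    tail = (100.0 - level) / 200.0          # proportion cut from each tail
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper

rng = random.Random(5)
# Hypothetical posterior samples for a single parameter.
samples = [rng.gauss(-6.462, 10.902) for _ in range(22_500)]

low50, high50 = credible_interval(samples, 50)
low95, high95 = credible_interval(samples, 95)
print(round(low50, 2), round(high50, 2))
print(round(low95, 2), round(high95, 2))
```

The 95% interval is necessarily wider than the 50% interval, because it must contain a larger share of the sampled values.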

## Changing the Confidence Level

- Click the Display tab in the Options dialog box.
- Type 95 as the Confidence level value.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-cfed66b865.jpg)
- Click the Close button. Amos now displays 95% credible intervals.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5919a7affd.jpg)


## Learning More about Bayesian Estimation

Gill (2004) provides a readable overview of Bayesian estimation and its advantages in a special issue of Political Analysis. Jackman (2000) offers a more technical treatment of the topic, with examples, in a journal article format. The book by Gelman et al. (2013) addresses a multitude of practical issues with numerous examples.

## Example 27

## Bayesian Estimation Using a Non-Diffuse Prior Distribution

## Introduction

This example demonstrates using a non-diffuse prior distribution.

## About the Example

Example 26 showed how to perform Bayesian estimation for a simple model with the uniform prior distribution that Amos uses by default. In the present example, we consider a more complex model and make use of a non-diffuse prior distribution. In particular, the example shows how to specify a prior distribution so that we avoid negative variance estimates and other improper estimates.

## More about Bayesian Estimation

In the discussion of the previous example, we noted that Bayesian estimation depends on information supplied by the analyst in conjunction with data. Whereas maximum likelihood estimation maximizes the likelihood of an unknown parameter $\theta$ given the observed data $\mathbf{y}$ through the relationship $L(\theta \mid \mathbf{y}) \propto p(\mathbf{y} \mid \theta)$, Bayesian estimation approximates the posterior density of $\theta$, $p(\theta \mid \mathbf{y}) \propto p(\theta) L(\theta \mid \mathbf{y})$, where $p(\theta)$ is the prior distribution of $\theta$, and $p(\theta \mid \mathbf{y})$ is the posterior density of $\theta$ given $\mathbf{y}$. Conceptually, this means that the posterior density of $\theta$ given $\mathbf{y}$ is proportional to the product of the prior distribution of $\theta$ and the likelihood of the observed data (Jackman, 2000, p. 377).


As the sample size increases, the likelihood function becomes more and more tightly concentrated about the ML estimate. In that case, a diffuse prior tends to be nearly flat or constant over the region where the likelihood is high; the shape of the posterior distribution is largely determined by the likelihood, that is, by the data themselves.

Under a uniform prior distribution for $\theta$, $p(\theta)$ is completely flat, and the posterior distribution is simply a re-normalized version of the likelihood. Even under a nonuniform prior distribution, the influence of the prior distribution diminishes as the sample size increases. Moreover, as the sample size increases, the joint posterior distribution for $\theta$ comes to resemble a normal distribution. For this reason, Bayesian and classical maximum likelihood analyses yield equivalent asymptotic results (Jackman, 2000). In smaller samples, if you can supply sensible prior information to the Bayesian procedure, the parameter estimates from a Bayesian analysis can be more precise. (The other side of the coin is that a bad prior can do harm by introducing bias.)
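The relationship between prior, likelihood, and posterior can be checked numerically on a grid of candidate parameter values. The following Python sketch is a toy illustration, not part of Amos: the five observations, the assumption of a normal likelihood with standard deviation 1, and the informative prior centered at 3 are all hypothetical.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

data = [4.1, 5.3, 6.0, 4.7, 5.9]            # hypothetical observations
grid = [i / 100 for i in range(0, 1001)]    # candidate values of theta on [0, 10]

def likelihood(theta):
    """Normal likelihood of the data with mean theta and standard deviation 1."""
    return math.exp(-0.5 * sum((y - theta) ** 2 for y in data))

like = [likelihood(t) for t in grid]
flat_prior = [1.0 for _ in grid]                                   # uniform prior
informative_prior = [math.exp(-0.5 * (t - 3.0) ** 2) for t in grid]

# Posterior is proportional to prior times likelihood.
post_flat = normalize([p * l for p, l in zip(flat_prior, like)])
post_informative = normalize([p * l for p, l in zip(informative_prior, like)])

mode_flat = grid[post_flat.index(max(post_flat))]
mode_informative = grid[post_informative.index(max(post_informative))]
print(mode_flat, mode_informative)
```

Under the flat prior, the posterior is just the re-normalized likelihood, so its mode coincides with the sample mean of the data. The informative prior pulls the mode toward the prior's center, and with only five observations the pull is noticeable.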

## Bayesian Analysis and Improper Solutions

One familiar problem in the fitting of latent variable models is the occurrence of improper solutions (Chen, Bollen, Paxton, Curran, and Kirby, 2001). An improper solution occurs, for example, when a variance estimate is negative. Such a solution is called improper because it is impossible for a variance to be less than 0. An improper solution may indicate that the sample is too small or that the model is wrong. Bayesian estimation cannot help with a bad model, but it can be used to avoid improper solutions that result from the use of small samples. Martin and McDonald (1975), discussing Bayesian estimation for exploratory factor analysis, suggested that estimates can be improved and improper solutions can be avoided by choosing a prior distribution that assigns zero probability to improper solutions. The present example demonstrates Martin and McDonald's approach to avoiding improper solutions by a suitable choice of prior distribution.
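The effect of a prior that assigns zero probability to negative variances can be sketched with a small sampler. The Python code below is a toy illustration and makes several assumptions that do not come from this guide: the likelihood for the variance is a hypothetical normal-shaped curve centered below 0, and a random-walk Metropolis sampler is used purely for demonstration.

```python
import math
import random
import statistics

def log_posterior(variance, lower_bound):
    """Log of prior times likelihood for a single variance parameter."""
    if variance < lower_bound:
        return -math.inf                  # prior density 0 below the bound
    # Hypothetical likelihood, roughly normal and centered below zero.
    return -0.5 * ((variance + 4.0) / 5.0) ** 2

def metropolis(lower_bound, n=20_000, seed=4):
    """Random-walk Metropolis sampler for the toy posterior."""
    rng = random.Random(seed)
    current, draws = 1.0, []
    for _ in range(n):
        proposal = current + rng.gauss(0.0, 3.0)
        log_ratio = (log_posterior(proposal, lower_bound)
                     - log_posterior(current, lower_bound))
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
        draws.append(current)
    return draws

diffuse = metropolis(lower_bound=-math.inf)   # prior places no limit on the variance
bounded = metropolis(lower_bound=0.0)         # prior density 0 for negative values

print(round(statistics.fmean(diffuse), 2), round(min(diffuse), 2))
print(round(statistics.fmean(bounded), 2), round(min(bounded), 2))
```

With the unbounded prior, many sampled variances are negative; with the lower bound at 0, a negative value is never accepted, so every summary of the samples is proper.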

## About the Data

Jamison and Scogin (1995) conducted an experimental study of the effectiveness of a new treatment for depression in which participants were asked to read and complete the homework exercises in Feeling Good: The New Mood Therapy (Burns, 1999, 2020). Jamison and Scogin randomly assigned participants to a control condition or an experimental condition, measured their levels of depression, treated the experimental group, and then re-measured participants' depression. The researchers did not rely on a single measure of depression. Instead, they used two well-known depression scales, the Beck Depression Inventory (Beck, 1967) and the Hamilton Rating Scale for Depression (Hamilton, 1960). We will call them BDI and HRSD for short. The data are in the file feelinggood.sav.

## Fitting a Model by Maximum Likelihood

The following figure shows the results of using maximum likelihood estimation to fit a model for the effect of treatment (COND) on depression at Time 2. Depression at Time 1 is used as a covariate. At Time 1 and then again at Time 2, BDI and HRSD are modeled as indicators of a single underlying variable, depression (DEPR).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-be61963dd0.jpg)


The path diagram for this model is in Ex27.amw. The chi-square statistic of 0.059 with one degree of freedom indicates a good fit, but the negative residual variance for post-therapy HRSD makes the solution improper.

## Bayesian Estimation with a Non-Informative (Diffuse) Prior

Does a Bayesian analysis with a diffuse prior distribution yield results similar to those of the maximum likelihood solution? To find out, we will do a Bayesian analysis of the same model. First, we will show how to increase the number of burn-in observations. This is just to show you how to do it. Nothing suggests that the default of 500 burn-in observations needs to be changed.

## Changing the Number of Burn-In Observations

To change the number of burn-in observations to 1,000:

- From the menus, choose View > Options.
- In the Options dialog, select the MCMC tab.
- Change Number of burn-in observations to 1000.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7b53b3f20a.jpg)
- Click Close and allow MCMC sampling to proceed until the unhappy face turns happy.



The summary table should look something like this:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0c68ad6ead.jpg)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a2f2c572a5.jpg)

Toolbar sample counter: (1,000+53,000)*4. Group number 1.

|  | Mean | S.E. | S.D. | C.S. | Median | 95% Lower bound | 95% Upper bound | Skewness | Kurtosis | Min | Max | Name |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |  |  |  |
| HRSD1<--DEPR1 | 0.525 | 0.006 | 0.168 | 1.001 | 0.515 | 0.229 | 0.886 | 0.436 | 0.328 | 0.107 | 1.399 |  |
| HRSD2<--DEPR2 | 0.919 | 0.003 | 0.105 | 1.000 | 0.907 | 0.745 | 1.154 | 0.681 | 0.833 | 0.617 | 1.464 |  |
| DEPR2<--COND | -10.383 | 0.043 | 1.595 | 1.000 | -10.390 | -13.484 | -7.241 | 0.010 | 0.010 | -16.973 | -4.704 |  |
| DEPR2<--DEPR1 | 0.599 | 0.007 | 0.197 | 1.001 | 0.593 | 0.238 | 1.010 | 0.285 | 0.083 | 0.045 | 1.409 |  |
| Means |  |  |  |  |  |  |  |  |  |  |  |  |
| COND | 0.502 | 0.001 | 0.058 | 1.000 | 0.502 | 0.388 | 0.613 | -0.045 | 0.051 | 0.211 | 0.729 |  |
| Intercepts |  |  |  |  |  |  |  |  |  |  |  |  |
| BDI1 | 21.705 | 0.013 | 0.784 | 1.000 | 21.696 | 20.189 | 23.280 | 0.062 | 0.034 | 18.846 | 24.966 |  |
| HRSD1 | 19.799 | 0.008 | 0.487 | 1.000 | 19.799 | 18.852 | 20.764 | 0.023 | 0.103 | 17.785 | 22.004 |  |
| BDI2 | 19.902 | 0.024 | 1.244 | 1.000 | 19.890 | 17.446 | 22.375 | 0.009 | 0.229 | 14.632 | 26.166 |  |
| HRSD2 | 19.192 | 0.015 | 0.820 | 1.000 | 19.177 | 17.587 | 20.801 | -0.003 | 0.141 | 15.507 | 23.146 |  |
| Covariances |  |  |  |  |  |  |  |  |  |  |  |  |
| COND<->DEPR1 | 0.344 | 0.009 | 0.406 | 1.000 | 0.335 | -0.429 | 1.181 | 0.147 | 0.240 | -1.212 | 2.203 |  |
| e2<->e4 | 14.123 | 0.146 | 5.250 | 1.000 | 13.738 | 4.805 | 25.660 | 0.487 | 0.688 | -2.313 | 40.992 |  |
| e3<->e5 | 1.222 | 0.068 | 2.668 | 1.000 | 1.333 | -4.247 | 6.063 | -0.515 | 1.760 | -15.785 | 11.580 |  |
| Variances |  |  |  |  |  |  |  |  |  |  |  |  |
| DEPR1 | 39.024 | 0.851 | 14.328 | 1.002 | 36.571 | 18.194 | 72.086 | 1.243 | 2.972 | 7.660 | 120.018 |  |
| COND | 0.275 | 0.001 | 0.046 | 1.000 | 0.271 | 0.200 | 0.377 | 0.594 | 0.554 | 0.146 | 0.544 |  |
| e1 | 27.067 | 0.103 | 6.706 | 1.000 | 26.260 | 16.333 | 42.591 | 0.787 | 1.145 | 8.732 | 71.053 |  |
| e2 | 12.180 | 0.783 | 13.675 | 1.002 | 14.405 | -20.664 | 33.097 | -1.205 | 3.036 | -66.065 | 56.616 |  |
| e3 | 9.506 | 0.142 | 3.805 | 1.001 | 9.471 | 1.863 | 17.038 | -0.048 | 0.335 | -6.715 | 25.733 |  |
| e4 | 32.888 | 0.285 | 8.330 | 1.001 | 32.048 | 18.868 | 51.593 | 0.703 | 1.157 | 6.626 | 78.513 |  |
| e5 | -3.880 | 0.219 | 5.256 | 1.001 | -3.324 | -15.421 | 4.898 | -0.922 | 2.654 | -36.545 | 14.380 |  |

In this analysis, we allowed Amos to reach its default limit of 100,000 MCMC samples. When Amos reaches this limit, it begins a process known as thinning. Thinning involves retaining an equally-spaced subset of samples rather than all samples. Amos begins the MCMC sampling process by retaining all samples until the limit of 100,000 samples is reached. At that point, if the data analyst has not halted the sampling process, Amos discards half of the samples by removing every alternate one, so that the lag-1 dependence in the remaining sequence is the same as the lag-2 dependence of the original unthinned sequence. From that point, Amos continues the sampling process, keeping one sample out of every two that are generated, until the upper limit of 100,000 is again reached. At that point, Amos thins the sample a second time and begins keeping one new sample out of every four, and so on.

Why does Amos perform thinning? Thinning reduces the autocorrelation between successive samples, so a thinned sequence of 100,000 samples provides more information than an unthinned sequence of the same length. In the current example, the displayed results are based on 53,000 samples that were collected after 1,000 burn-in samples, for a total of 54,000 samples. However, this is after the sequence of samples has been thinned twice, so that four samples had to be generated for every one that was kept. If thinning had not been performed, there would have been $1{,}000 \times 4 = 4{,}000$ burn-in samples and $53{,}000 \times 4 = 212{,}000$ analysis samples.
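The thinning procedure can be traced with a short simulation. The Python sketch below follows the description above (retain every sample, halve the retained set whenever it reaches the limit, and then keep one new sample out of every 2, 4, and so on); the function name and the use of sample indices in place of actual draws are assumptions made for illustration.

```python
def simulate_thinning(total_generated, limit=100_000):
    """Return (retained sample indices, thinning step) after generating samples."""
    kept, step = [], 1
    for index in range(total_generated):
        if index % step == 0:      # keep one sample out of every `step` generated
            kept.append(index)
        if len(kept) == limit:     # limit reached: drop every alternate sample
            kept = kept[::2]
            step *= 2
    return kept, step

# 54,000 retained samples at a thinning step of 4 correspond to 216,000 generated.
kept, step = simulate_thinning(216_000)
print(len(kept), step)
```

After two rounds of thinning, the retained samples are spaced four iterations apart, which matches a toolbar counter of the form (1,000+53,000)*4.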

The results of the Bayesian analysis are very similar to the maximum likelihood results. The posterior Mean for the residual variance of $e5$ is negative, just as the maximum likelihood estimate is. The posterior distribution itself lies largely to the left of 0.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a3cfa2d08e.jpg)

Fortunately, there is a remedy for this problem: Assign a prior density of 0 to any parameter vector for which the variance of $e5$ is negative. To change the prior distribution of the variance of $e5$:

- From the menus, choose View > Prior.

Alternatively, click the Prior button on the Bayesian SEM toolbar, or enter the keyboard combination Ctrl+R. Amos displays the Prior dialog.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8b4cb47a5d.jpg)

- Select the variance of $e5$ in the Bayesian SEM window to display the default prior distribution for $e5$.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-87b1e0faab.jpg)
- Replace the default lower bound of $-3.4 \times 10^{38}$ with 0.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5d6b68c405.jpg)
- Click Apply to save this change.

Amos immediately discards the accumulated MCMC samples and begins sampling all over again. After a while, the Bayesian SEM window should look something like this:

|  | Mean | S.E. | S.D. | C.S. | Median | 95% Lower bound | 95% Upper bound | Skewness | Kurtosis | Min | Max | Name |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |  |  |  |
| HRSD1<--DEPR1 | 0.498 | 0.006 | 0.162 | 1.001 | 0.492 | 0.197 | 0.844 | 0.403 | 0.600 | 0.089 | 1.383 |  |
| HRSD2<--DEPR2 | 0.827 | 0.001 | 0.066 | 1.000 | 0.823 | 0.706 | 0.967 | 0.350 | 0.483 | 0.563 | 1.213 |  |
| DEPR2<--COND | -11.280 | 0.028 | 1.455 | 1.000 | -11.256 | -14.210 | -8.506 | -0.137 | 0.241 | -17.719 | -5.678 |  |
| DEPR2<--DEPR1 | 0.584 | 0.007 | 0.193 | 1.001 | 0.583 | 0.208 | 0.972 | 0.155 | 0.135 | 0.045 | 1.625 |  |
| Means |  |  |  |  |  |  |  |  |  |  |  |  |
| COND | 0.500 | 0.001 | 0.059 | 1.000 | 0.501 | 0.384 | 0.615 | -0.038 | 0.045 | 0.244 | 0.746 |  |
| Intercepts |  |  |  |  |  |  |  |  |  |  |  |  |
| BDI1 | 21.685 | 0.010 | 0.794 | 1.000 | 21.689 | 20.113 | 23.247 | 0.003 | 0.128 | 18.220 | 25.137 |  |
| HRSD1 | 19.799 | 0.007 | 0.495 | 1.000 | 19.803 | 18.825 | 20.760 | -0.047 | 0.121 | 17.558 | 21.918 |  |
| BDI2 | 20.325 | 0.018 | 1.162 | 1.000 | 20.321 | 18.055 | 22.607 | 0.021 | 0.037 | 15.499 | 25.067 |  |
| HRSD2 | 19.110 | 0.013 | 0.809 | 1.000 | 19.118 | 17.521 | 20.667 | -0.021 | 0.069 | 15.709 | 22.370 |  |
| Covariances |  |  |  |  |  |  |  |  |  |  |  |  |
| COND<->DEPR1 | 0.272 | 0.006 | 0.414 | 1.000 | 0.260 | -0.531 | 1.127 | 0.134 | 0.391 | -1.496 | 2.400 |  |
| e2<->e4 | 10.331 | 0.099 | 4.434 | 1.000 | 10.105 | 2.249 | 19.714 | 0.316 | 0.692 | -8.257 | 38.711 |  |
| e3<->e5 | 3.012 | 0.061 | 2.189 | 1.000 | 2.940 | -1.139 | 7.557 | 0.194 | 0.425 | -7.271 | 13.210 |  |
| Variances |  |  |  |  |  |  |  |  |  |  |  |  |
| DEPR1 | 41.274 | 1.011 | 16.800 | 1.002 | 37.954 | 18.327 | 86.251 | 1.353 | 2.472 | 3.877 | 123.457 |  |
| COND | 0.274 | 0.001 | 0.047 | 1.000 | 0.269 | 0.198 | 0.381 | 0.724 | 0.932 | 0.133 | 0.552 |  |
| e1 | 27.240 | 0.138 | 6.447 | 1.000 | 26.492 | 16.626 | 41.876 | 0.649 | 0.644 | 6.812 | 62.578 |  |
| e2 | 8.375 | 1.004 | 16.031 | 1.002 | 11.618 | -37.763 | 30.435 | -1.526 | 3.207 | -74.182 | 50.045 |  |
| e3 | 10.372 | 0.133 | 3.625 | 1.001 | 10.286 | 3.341 | 17.530 | 0.070 | 0.614 | -8.973 | 25.956 |  |
| e4 | 25.173 | 0.070 | 5.341 | 1.000 | 24.855 | 15.548 | 36.595 | 0.396 | 0.717 | 4.618 | 63.297 |  |
| e5 | 2.380 | 0.021 | 1.999 | 1.000 | 1.874 | 0.083 | 7.411 | 1.377 | 2.592 | 0.000 | 18.708 |  |

The posterior mean of the variance of $e5$ is now positive. Examining its posterior distribution confirms that no sampled values fall below 0.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b311a63a38.jpg)
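Setting the prior density to 0 below a bound means that the sampler can never retain a value below that bound. The following toy random-walk Metropolis sampler illustrates the effect. It is a sketch under stated assumptions, not the algorithm Amos uses: the target density, the step size, and the names `log_posterior` and `metropolis` are all hypothetical.

```python
import math
import random


def log_posterior(theta, lower_bound):
    """Unnormalized log posterior for a toy variance parameter."""
    if theta < lower_bound:
        return float("-inf")          # prior density 0 below the bound
    # Toy "likelihood": a normal curve centred at -0.5 with sd 2, chosen only
    # so that an unbounded prior would put much of its mass below 0.
    return -0.5 * ((theta + 0.5) / 2.0) ** 2


def metropolis(n_samples, lower_bound, seed=0, step=1.0):
    rng = random.Random(seed)
    current = 1.0                     # admissible starting value
    draws = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        log_ratio = log_posterior(proposal, lower_bound) - log_posterior(current, lower_bound)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal        # accept; inadmissible proposals are never accepted
        draws.append(current)
    return draws


bounded = metropolis(5000, lower_bound=0.0)
unbounded = metropolis(5000, lower_bound=float("-inf"))
print(min(bounded) >= 0.0)    # True: no retained value falls below the bound
print(min(unbounded) < 0.0)
```

A proposal below the bound has log posterior of negative infinity, so its acceptance probability is exactly 0, which mirrors the Min value of 0.000 reported for the variance of e5.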

Is this solution proper? The posterior mean of each variance is positive, but a glance at the Min column shows that some of the sampled values for the variance of $e2$ and the variance of $e3$ are negative. To avoid negative variances for $e2$ and $e3$, we can modify their prior distributions just as we did for $e5$.

It is not too difficult to impose such constraints on a parameter-by-parameter basis in small models like this one. However, there is also a way to automatically set the prior density to 0 for any parameter values that are improper. To use this feature:

- From the menus, choose View > Options.
- In the Options dialog, click the Prior tab.
- Select Admissibility test. (A check mark will appear next to it.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-52f5aff5a7.jpg)

Selecting Admissibility test sets the prior density to 0 for parameter values that result in a model where any covariance matrix fails to be positive definite. In particular, the prior density is set to 0 for non-positive variances.
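A positive-definiteness check of the kind the admissibility test relies on can be sketched with a Cholesky factorization, which succeeds only when every pivot is strictly positive. This is a generic illustration, not Amos code; the function name `is_positive_definite` is an assumption, and the first matrix merely reuses the posterior means of the e3 and e5 variances and their covariance from the table above as sample input.

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix (list of lists) is positive definite."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= 0.0:      # non-positive pivot: not positive definite
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return True


admissible = [[10.372, 3.012], [3.012, 2.380]]     # variances positive, covariance moderate
negative_variance = [[10.372, 3.012], [3.012, -0.5]]
covariance_too_large = [[1.0, 2.0], [2.0, 1.0]]    # implied correlation exceeds 1

print(is_positive_definite(admissible))            # True
print(is_positive_definite(negative_variance))     # False
print(is_positive_definite(covariance_too_large))  # False
```

The last case shows why the test is stricter than a simple sign check on variances: a matrix with positive variances can still be inadmissible.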

Amos also provides a stability test option that works much like the admissibility test option. Selecting Stability test sets the prior density to 0 for parameter values that result in an unstable system of linear equations.

As soon as you select Admissibility test, the MCMC sampling starts all over, discarding any previously accumulated samples. After a short time, the results should look something like this:

|  | Mean | S.E. | S.D. | C.S. | Median | 95% Lower bound | 95% Upper bound | Skewness | Kurtosis | Min | Max | Name |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |  |  |  |
| HRSD1<--DEPR1 | 0.574 | 0.007 | 0.137 | 1.001 | 0.555 | 0.358 | 0.899 | 0.728 | 0.481 | 0.199 | 1.149 |  |
| HRSD2<--DEPR2 | 0.816 | 0.002 | 0.063 | 1.001 | 0.812 | 0.699 | 0.955 | 0.285 | 0.213 | 0.574 | 1.089 |  |
| DEPR2<--COND | -11.304 | 0.069 | 1.396 | 1.001 | -11.261 | -14.138 | -8.588 | -0.094 | 0.210 | -18.362 | -6.380 |  |
| DEPR2<--DEPR1 | 0.674 | 0.008 | 0.176 | 1.001 | 0.662 | 0.373 | 1.056 | 0.606 | 1.097 | 0.113 | 1.512 |  |
| Means |  |  |  |  |  |  |  |  |  |  |  |  |
| COND | 0.496 | 0.002 | 0.059 | 1.001 | 0.496 | 0.381 | 0.612 | 0.037 | -0.065 | 0.302 | 0.711 |  |
| Intercepts |  |  |  |  |  |  |  |  |  |  |  |  |
| BDI1 | 21.635 | 0.037 | 0.808 | 1.001 | 21.624 | 20.041 | 23.206 | -0.085 | 0.032 | 18.378 | 24.665 |  |
| HRSD1 | 19.786 | 0.016 | 0.472 | 1.001 | 19.787 | 18.848 | 20.730 | 0.011 | 0.094 | 17.924 | 21.653 |  |
| BDI2 | 20.334 | 0.056 | 1.153 | 1.001 | 20.304 | 18.088 | 22.612 | 0.024 | -0.123 | 16.207 | 24.285 |  |
| HRSD2 | 19.039 | 0.036 | 0.809 | 1.001 | 19.056 | 17.443 | 20.575 | -0.063 | -0.105 | 16.054 | 21.890 |  |
| Covariances |  |  |  |  |  |  |  |  |  |  |  |  |
| COND<->DEPR1 | 0.262 | 0.021 | 0.382 | 1.001 | 0.260 | -0.457 | 1.057 | 0.135 | 0.233 | -1.210 | 1.726 |  |
| e2<->e4 | 11.139 | 0.213 | 4.354 | 1.001 | 10.854 | 3.393 | 20.485 | 0.383 | 0.529 | -2.302 | 31.841 |  |
| e3<->e5 | 2.235 | 0.102 | 1.813 | 1.002 | 2.108 | -1.028 | 6.195 | 0.364 | 0.248 | -3.058 | 10.431 |  |
| Variances |  |  |  |  |  |  |  |  |  |  |  |  |
| DEPR1 | 33.449 | 0.549 | 9.634 | 1.002 | 32.785 | 16.720 | 54.667 | 0.421 | 0.112 | 7.654 | 68.491 |  |
| COND | 0.272 | 0.002 | 0.047 | 1.001 | 0.267 | 0.196 | 0.379 | 0.701 | 0.981 | 0.140 | 0.527 |  |
| e1 | 26.195 | 0.237 | 6.195 | 1.001 | 25.553 | 15.988 | 40.134 | 0.685 | 0.870 | 10.870 | 58.234 |  |
| e2 | 16.532 | 0.340 | 7.577 | 1.001 | 16.049 | 3.349 | 32.737 | 0.402 | 0.078 | 0.113 | 53.330 |  |
| e3 | 8.694 | 0.164 | 2.859 | 1.002 | 8.787 | 2.809 | 14.023 | -0.049 | 0.395 | 0.004 | 23.794 |  |
| e4 | 24.459 | 0.217 | 5.286 | 1.001 | 24.232 | 14.846 | 36.060 | 0.420 | 0.935 | 3.736 | 55.246 |  |
| e5 | 2.755 | 0.105 | 2.025 | 1.001 | 2.323 | 0.282 | 7.964 | 1.467 | 3.069 | 0.006 | 15.824 |  |

Notice that the analysis took only 73,000 observations to meet the convergence criterion for all estimands. Minimum values for all estimated variances are now positive.

## Example 28

## Bayesian Estimation of Values Other Than Model Parameters

## Introduction

This example shows how to estimate quantities other than model parameters in a Bayesian analysis.

## About the Example

Examples 26 and 27 demonstrated Bayesian analysis. In both of those examples, we were concerned exclusively with estimating model parameters. We may also be interested in estimating other quantities that are functions of the model parameters. For instance, one of the most common uses of structural equation modeling is the simultaneous estimation of direct and indirect effects. In this example, we demonstrate how to estimate the posterior distribution of an indirect effect.

## The Wheaton Data Revisited

In Example 6, we profiled the Wheaton et al. (1977) alienation data and described three alternative models for the data. Here, we re-examine Model C from Example 6. The following path diagram is in the file Ex28.amw:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3e0edb2efc.jpg)
Figure: Example 28, Bayesian Estimation, Wheaton (1977), Model C, model specification (Chi-square = `\cmin`, df = `\df`, p = `\p`).

## Indirect Effects

Suppose we are interested in the indirect effect of ses on alienation71 through the mediation of alienation67. In other words, we suspect that socioeconomic status exerts an impact on alienation in 1967, which in turn influences alienation in 1971.

## Estimating Indirect Effects

- Before starting the Bayesian analysis, from the menus in Amos Graphics, choose View > Analysis Properties.
- In the Analysis Properties dialog, click the Output tab.
- Select Indirect, direct & total effects and Standardized estimates to estimate standardized indirect effects. (A check mark will appear next to these options.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7f36346d78.jpg)
- Close the Analysis Properties dialog.

- From the menus, choose Analyze > Calculate Estimates to obtain the maximum likelihood chi-square test of model fit and the parameter estimates.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-9e2564ccdc.jpg)
Figure: Example 28, Bayesian Estimation, Wheaton (1977), Model C, standardized estimates (Chi-square = 7.50, df = 8, p = .48).

The results are identical to those shown in Example 6, Model C. The standardized direct effect of ses on alienation71 is -0.19. The standardized indirect effect of ses on alienation71 is defined as the product of two standardized direct effects: the standardized direct effect of ses on alienation67 (-0.56) and the standardized direct effect of alienation67 on alienation71 (0.58). The product of these two standardized direct effects is $-0.56 \times 0.58=-0.32$.
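For readers who want to check the arithmetic, here is a minimal Python sketch using the rounded estimates quoted above. The total effect line relies on the standard decomposition (total = direct + indirect), which is an addition for illustration and is not a figure reported in this section.

```python
a = -0.56        # standardized direct effect: ses -> alienation67
b = 0.58         # standardized direct effect: alienation67 -> alienation71
direct = -0.19   # standardized direct effect: ses -> alienation71

indirect = a * b             # product of the two direct effects on the mediated path
total = direct + indirect    # standard decomposition (assumption, see lead-in)

print(round(indirect, 2))    # -0.32
print(round(total, 2))       # -0.51
```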

You do not have to work the standardized indirect effect out by hand. To view all the standardized indirect effects:

- From the menus, choose View > Text Output.
- In the upper left corner of the Amos Output window, select Estimates, then Matrices, and then Standardized Indirect Effects.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b945f83199.jpg)


## Bayesian Analysis of Model C

To begin Bayesian estimation for Model C:

- From the menus, choose Analyze > Bayesian Estimation.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-21aaf6e2aa.jpg)

The MCMC algorithm converges quite rapidly, within 22,000 MCMC samples.

## Additional Estimands

The summary table displays results for model parameters only. To estimate the posterior of quantities derived from the model parameters, such as indirect effects:

- From the menus, choose View > Additional Estimands.

Estimating the marginal posterior distribution of the additional estimands may take a while. A status window keeps you informed of progress.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-2389b3a479.jpg)

Results are displayed in the Additional Estimands window. To display the posterior mean for each standardized indirect effect:

- Select Standardized Indirect Effects and Mean in the panel at the left side of the window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c3d605ec0b.jpg)
- To print the results, select the items you want to print. (A check mark will appear next to them.)
- From the menus, choose File > Print.

Be careful because it is possible to generate a lot of printed output. If you put a check mark in every check box in this example, the program will print $1 \times 8 \times 11=88$ matrices.

- To view the posterior means of the standardized direct effects, select Standardized Direct Effects and Mean in the panel at the left.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-26346f5f60.jpg)

The posterior means of the standardized direct and indirect effects of socioeconomic status on alienation in 1971 are almost identical to the maximum likelihood estimates.
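Conceptually, the posterior of a derived quantity is obtained by evaluating that quantity once for every retained MCMC sample and then summarizing the resulting values. The sketch below shows this for an indirect effect. The five parameter draws are made-up numbers for illustration only, and how Amos computes additional estimands internally is not documented here.

```python
import statistics

# Made-up posterior draws of (a, b): ses -> alienation67 and alienation67 -> alienation71.
draws = [(-0.60, 0.55), (-0.52, 0.62), (-0.56, 0.58), (-0.58, 0.60), (-0.54, 0.57)]

# Evaluate the derived quantity for every draw, then summarize.
indirect_draws = [a * b for a, b in draws]
posterior_mean = statistics.mean(indirect_draws)

# Plugging the parameter means into the formula is close, but not the same thing.
product_of_means = statistics.mean(a for a, _ in draws) * statistics.mean(b for _, b in draws)

print(round(posterior_mean, 4))      # -0.3266
print(round(product_of_means, 4))    # -0.327
```

The small gap between the two numbers is why the derived quantity is computed draw by draw rather than from the parameter summaries.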

## Inferences about Indirect Effects

There are two methods for finding a confidence interval for an indirect effect or for testing an indirect effect for significance. Sobel (1982, 1986) gives a method that assumes that the indirect effect is normally distributed. A growing body of statistical simulation literature calls into question this assumption, however, and advocates the use of the bootstrap to construct better, typically asymmetric, confidence intervals (MacKinnon, Lockwood, and Williams, 2004; Shrout and Bolger, 2002). These studies have found that the bias-corrected bootstrap confidence intervals available in Amos produce reliable inferences for indirect effects.
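As a point of comparison, the first-order Sobel standard error for a product of two coefficients, $\sqrt{b^{2} s_{a}^{2}+a^{2} s_{b}^{2}}$, leads to a symmetric normal-theory interval. The sketch below is illustrative only: the function name `sobel_interval` is an assumption, and the input values are hypothetical numbers chosen to be of realistic size, not estimates reported in this section.

```python
import math


def sobel_interval(a, se_a, b, se_b, z=1.96):
    """First-order Sobel standard error and normal-theory interval for a*b."""
    indirect = a * b
    se = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return indirect, se, (indirect - z * se, indirect + z * se)


# Hypothetical coefficients and standard errors.
indirect, se, (lower, upper) = sobel_interval(a=-0.561, se_a=0.054, b=0.604, se_b=0.046)
print(round(indirect, 3), round(se, 3))
print(round(lower, 3), round(upper, 3))
```

By construction the interval is centered on the point estimate, which is exactly the symmetry that the bootstrap and Bayesian alternatives relax.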

As an alternative to the Sobel method and the bootstrap for finding confidence intervals, Amos can provide (typically asymmetric) credible intervals for standardized or unstandardized indirect effects. The next figure shows the lower boundary of a 95% credible interval for each standardized indirect effect in the model. Notice that 95% Lower bound is selected in the panel at the left of the Additional Estimands window. (You can specify a value other than 95% in the Bayesian SEM Options dialog.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4e68576ec8.jpg)

The lower boundary of the 95% credible interval for the indirect effect of socioeconomic status on alienation in 1971 is -0.382. The corresponding upper boundary value is -0.270, as shown below:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0d9edb2af2.jpg)

We are now 95% certain that the true value of this standardized indirect effect lies between -0.382 and -0.270. To view the posterior distribution:

- From the menus in the Additional Estimands window, choose View > Posterior.

At first, Amos displays an empty posterior window.

The window shows only the message "Please click an estimand to view its posterior distribution".

- Select Mean and Standardized Indirect Effects in the Additional Estimands window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-663e6befba.jpg)

Amos then displays the posterior distribution of the indirect effect of socioeconomic status on alienation in 1971. The distribution of the indirect effect is approximately, but not exactly, normal.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b616ce76d6.jpg)
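A credible interval of the kind reported above can be read directly off the sampled values. The sketch below computes an equal-tailed 95% interval from simulated draws. Everything in it is an assumption made for illustration: the draws are generated from independent normal distributions rather than taken from Amos, and whether Amos uses exactly this equal-tailed percentile rule is not stated in this guide.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical posterior draws of the two component effects (not Amos output).
a_draws = [rng.gauss(-0.56, 0.05) for _ in range(20000)]
b_draws = [rng.gauss(0.58, 0.04) for _ in range(20000)]
indirect = [a * b for a, b in zip(a_draws, b_draws)]

# Cut points at 2.5%, 5%, ..., 97.5%; the first and last bound the central 95%.
cuts = statistics.quantiles(indirect, n=40)
lower, upper = cuts[0], cuts[-1]

inside = sum(lower <= x <= upper for x in indirect) / len(indirect)
print(round(lower, 3), round(upper, 3), round(inside, 3))
```

Because the bounds are percentiles of the draws themselves, the interval does not have to be symmetric about the posterior mean, which matters when the distribution is only approximately normal.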

## Example 29

## Estimating a User-Defined Quantity in Bayesian SEM

## Introduction

This example shows how to estimate a user-defined quantity: in this case, the difference between a direct effect and an indirect effect.

## About the Example

In the previous example, we showed how to use the Additional Estimands feature of Amos Bayesian analysis to estimate an indirect effect. Suppose you wanted to carry the analysis a step further and address a commonly asked research question: How does an indirect effect compare to the corresponding direct effect?

## The Stability of Alienation Model

You can use the Custom Estimands feature of Amos to estimate and draw inferences about an arbitrary function of the model parameters. To illustrate the Custom Estimands feature, let us revisit the previous example. The path diagram for the model is shown on p. 459 and can be found in the file Ex29.amw. The model allows socioeconomic status to exert a direct effect on alienation experienced in 1971. It also allows an indirect effect that is mediated by alienation experienced in 1967.

The remainder of this example focuses on the direct effect, the indirect effect, and a comparison of the two. Notice that we supplied parameter labels for the direct effect ("c") and the two components of the indirect effect ("a" and "b"). Although not required, parameter labels make it easier to specify custom estimands.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-9a7c759de8.jpg)

To begin a Bayesian analysis of this model:

- From the menus, choose Analyze > Bayesian Estimation.

After a while, the Bayesian SEM window should look something like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7286f0cf91.jpg)

- From the menus, choose View > Additional Estimands.
- In the Additional Estimands window, select Standardized Direct Effects and Mean.

The posterior mean for the direct effect of ses on alienation71 is -0.195.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ea8f9c0248.jpg)

- Select Standardized Indirect Effects and Mean.

The posterior mean for the indirect effect of socioeconomic status on alienation in 1971 is -0.320.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c2c1080115.jpg)

The posterior distribution of the indirect effect lies entirely to the left of 0, so we are practically certain that the indirect effect is less than 0.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-cf3a2b445b.jpg)

You can also display the posterior distribution of the direct effect. The program does not, however, have any built-in way to examine the posterior distribution of the difference between the indirect effect and the direct effect (or perhaps their ratio). This is a case of wanting to estimate and draw inferences about a quantity that the developers of the program did not anticipate. For this, you need to extend the capabilities of Amos by defining your own custom estimand.

## Numeric Custom Estimands

In this section, we show how to write a Visual Basic program for estimating the numeric difference between a direct effect and an indirect effect. (You can use C# instead of Visual Basic.) The final Visual Basic program is in the file Ex29.vb.

The first step in writing a program to define a custom estimand is to open the custom estimands window.

- From the menus on the Bayesian SEM window, choose View > Custom estimands.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7fb71951df.jpg)

This window displays a skeleton Visual Basic program to which we will add lines to define the new quantities that we want Amos to estimate.

Note: If you want to use C# instead of Visual Basic, from the menus, choose File > New Estimands (C#).

The skeleton program contains a subroutine and a function. You have no control over when the subroutine and the function are called. They are called by Amos.

- Amos calls your DeclareEstimands subroutine once to find out how many new quantities (estimands) you want to estimate and what you want to call them.
- Amos calls your CalculateEstimands function repeatedly. Each time your CalculateEstimands function is called, it has to calculate the value of your custom estimands for a given set of parameter values.
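The calling pattern described in these two points can be pictured with a small Python analogy: one method is called once to announce the estimand names, and another is called once per set of parameter values. This is not the Amos API. The class `DifferenceEstimands`, the driver function `run`, and the dictionary-based parameter samples are all hypothetical stand-ins, and the parameter labels "a", "b", and "c" are the ones used in the path diagram of this example.

```python
class DifferenceEstimands:
    """Python analogy of the two-callback pattern (not the Amos API)."""

    def declare_estimands(self):
        # Called once: report the names of the quantities to be estimated.
        return ["direct", "indirect", "difference"]

    def calculate_estimands(self, params):
        # Called once per retained sample with that sample's parameter values.
        direct = params["c"]
        indirect = params["a"] * params["b"]
        return {"direct": direct, "indirect": indirect, "difference": indirect - direct}


def run(estimands, samples):
    """Hypothetical driver playing the role of the host program."""
    names = estimands.declare_estimands()
    history = {name: [] for name in names}
    for params in samples:
        values = estimands.calculate_estimands(params)
        for name in names:
            history[name].append(values[name])
    return history


# Two made-up parameter samples.
samples = [{"a": -0.5, "b": 0.6, "c": -0.2}, {"a": -0.6, "b": 0.5, "c": -0.25}]
history = run(DifferenceEstimands(), samples)
print(history["difference"])
```

The user-written code never decides when it runs; it only answers the two questions the host program asks, which is the division of labor the skeleton program sets up.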

In the subroutine DeclareEstimands, you need to replace the placeholder "Your code goes here" with lines that specify how many new quantities you want to estimate and what you want to call them. For this example, we want to estimate the difference between the direct effect of ses on alienation71 and the corresponding indirect effect. We will also write code for computing the direct effect and the indirect effect individually. To define each estimand, we use the keyword newestimand, as shown below:

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        'Your code goes here.
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

The words "direct", "indirect", and "difference" are estimand labels. You can use different labels.

In the function CalculateEstimands, the placeholder "Your code goes here" needs to be replaced with lines for evaluating the estimands called "direct", "indirect", and "difference". We start by writing Visual Basic code for computing the direct effect. In the following figure, we have already typed part of a Visual Basic statement: estimand("direct").value =

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value =
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

We need to finish the statement by adding additional code to the right of the equals (=) sign, describing how to compute the direct effect. The direct effect is to be calculated for a set of parameter values that are accessible through the AmosEngine object that is supplied as an argument to the CalculateEstimands function. Unless you are an expert Amos programmer, you would not know how to use the AmosEngine object; however, there is an easy way to get the needed Visual Basic syntax by dragging and dropping.

## Dragging and Dropping

- Find the direct effect in the Bayesian SEM window and click to select its row. (Its row is highlighted in the following figure.)
- Move the mouse pointer to an edge of the selected row. Either the top edge or the bottom edge will do.

Tip: When you get the mouse pointer on the right spot, a plus (+) symbol will appear next to the mouse pointer.

|  | Mean | S.E. | S.D. | C.S. | Median | 95% Lower bound | 95% Upper bound | Skewness | Kurtosis | Min | Max | Name |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |  |  |  |
| powles71<--alienation71 | 1.000 | 0.002 | 0.041 | 1.001 | 0.998 | 0.926 | 1.086 | 0.229 | -0.103 | 0.869 | 1.165 | path_p |
| alienation71<--alienation67 | 0.604 | 0.002 | 0.046 | 1.001 | 0.603 | 0.516 | 0.693 | 0.063 | -0.091 | 0.445 | 0.761 | b |
| alienation71<--ses | -0.206 | 0.002 | 0.049 | 1.001 | -0.204 | -0.306 | -0.117 | -0.245 | 0.181 | -0.414 | -0.070 | c |
| alienation67<--ses | -0.561 | 0.002 | 0.054 | 1.001 | -0.560 | -0.671 | -0.460 | -0.066 | -0.053 | -0.766 | -0.374 | a |
| SEI<--ses | 5.201 | 0.020 | 0.433 | 1.001 | 5.194 | 4.378 | 6.061 | 0.129 | 0.175 | 3.735 | 7.053 |  |
| Intercepts |  |  |  |  |  |  |  |  |  |  |  |  |
| anomia67 | 13.610 | 0.006 | 0.113 | 1.001 | 13.613 | 13.373 | 13.827 | -0.122 | -0.121 | 13.233 | 13.972 |  |
| powles67 | 14.760 | 0.004 | 0.106 | 1.001 | 14.760 | 14.549 | 14.965 | -0.060 | -0.269 | 14.431 | 15.110 |  |
| anomia71 | 14.132 | 0.005 | 0.118 | 1.001 | 14.135 | 13.896 | 14.361 | -0.105 | -0.229 | 13.683 | 14.522 |  |
| powles71 | 14.896 | 0.004 | 0.104 | 1.001 | 14.899 | 14.687 | 15.099 | -0.039 | -0.155 | 14.517 | 15.277 |  |
| education | 10.898 | 0.005 | 0.100 | 1.001 | 10.900 | 10.700 | 11.093 | -0.020 | -0.018 | 10.438 | 11.250 |  |
| SEI | 37.486 | 0.034 | 0.690 | 1.001 | 37.492 | 36.145 | 38.820 | -0.025 | -0.147 | 35.112 | 39.925 |  |
| Covariances |  |  |  |  |  |  |  |  |  |  |  |  |
| eps1<->eps3 | 1.889 | 0.012 | 0.253 | 1.001 | 1.888 | 1.392 | 2.378 | -0.002 | 0.164 | 1.015 | 2.911 |  |
| Variances |  |  |  |  |  |  |  |  |  |  |  |  |
| eps1 | 4.976 | 0.015 | 0.299 | 1.001 | 4.964 | 4.403 | 5.605 | 0.127 | 0.156 | 3.962 | 6.082 | var_a |
| eps2 | 2.454 | 0.010 | 0.228 | 1.001 | 2.453 | 1.997 | 2.898 | -0.006 | -0.035 | 1.586 | 3.339 | var_p |
| ses | 6.857 | 0.030 | 0.676 | 1.001 | 6.825 | 5.587 | 8.285 | 0.200 | -0.113 | 4.534 | 9.483 |  |
| zeta1 | 4.847 | 0.024 | 0.432 | 1.002 | 4.827 | 4.052 | 5.751 | 0.178 | -0.150 | 3.445 | 6.307 |  |
| zeta2 | 3.833 | 0.018 | 0.328 | 1.002 | 3.818 | 3.233 | 4.527 | 0.281 | 0.262 | 2.811 | 5.443 |  |
| delta1 | 2.775 | 0.024 | 0.518 | 1.001 | 2.788 | 1.729 | 3.763 | -0.149 | -0.052 | 1.016 | 4.610 |  |
| delta2 | 267.466 | 0.807 | 17.531 | 1.001 | 267.373 | 235.028 | 301.055 | 0.066 | -0.134 | 209.770 | 325.935 |  |

- Hold down the left mouse button, drag the mouse pointer into the Visual Basic window to the spot where you want the expression for the direct effect to go, and release the mouse button.

When you complete this operation, Amos fills in the appropriate parameter expression, as shown in the next figure:

```
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

The parameter on the right side of the equation is identified by the label ("c") that was used in the path diagram shown earlier.

We next turn our attention to calculating the indirect effect of socioeconomic status on alienation in 1971. This indirect effect is defined as the product of two direct effects: the direct effect of socioeconomic status on alienation in 1967 (parameter a) and the direct effect of alienation in 1967 on alienation in 1971 (parameter b).

- On the left side of the Visual Basic assignment statement for computing the indirect effect, type estimand("indirect").value =

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value =
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

Using the same drag-and-drop process as previously described, start dragging things from the Bayesian SEM window to the Unnamed.vb window.

- First, drag the direct effect of socioeconomic status on alienation in 1967 to the right side of the equals sign in the unfinished statement.

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value = sem.ParameterValue("a")
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```


- Next, drag and drop the direct effect of 1967 alienation on 1971 alienation.

This second direct effect appears in the Unnamed.vb window as sem.ParameterValue("b").

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value = sem.ParameterValue("a")sem.ParameterValue("b")
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

- Finally, use the keyboard to insert an asterisk (*) between the two parameter values.

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value = sem.ParameterValue("a") * sem.ParameterValue("b")
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

Hint: For complicated custom estimands, you can also drag and drop from the Additional Estimands window to the Custom Estimands window.

To compute the difference between the direct and indirect effects, add a third line of Visual Basic syntax, as seen in the following figure:

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub
    Public Function CalculateEstimands(ByVal sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value = sem.ParameterValue("a") * sem.ParameterValue("b")
        estimand("difference").value = estimand("indirect").value - estimand("direct").value
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

- To find the posterior distribution of all three custom estimands, click File > Run (or click the Run button on the toolbar).

The results will take a few seconds. A status window keeps you informed of progress.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-04b56f373b.jpg)

The marginal posterior distributions of the three custom estimands are summarized in the following table:

| Numeric Estimands | Mean | S.E. | S.D. | C.S. | Median | 95% Lower bound | 95% Upper bound | Skewness | Kurtosis | Min | Max |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| direct | -0.206 | 0.002 | 0.049 | 1.001 | -0.204 | -0.306 | -0.117 | -0.245 | 0.181 | -0.414 | -0.070 |
| indirect | -0.339 | 0.002 | 0.039 | 1.001 | -0.337 | -0.419 | -0.268 | -0.284 | 0.224 | -0.503 | -0.206 |
| difference | -0.132 | 0.003 | 0.070 | 1.001 | -0.130 | -0.272 | 0.000 | -0.111 | 0.180 | -0.412 | 0.111 |

The results for direct can also be found in the Bayesian SEM summary table, and the results for indirect can be found in the Additional Estimands table. We are really interested in difference. Its posterior mean is -0.132. Its minimum is -0.412, and its maximum is 0.111.
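As a quick consistency check on the table, the posterior mean of difference should equal the posterior mean of indirect minus the posterior mean of direct, because the mean of a difference is the difference of the means. Writing $a b$ for the indirect effect and $c$ for the direct effect, and using the rounded table entries:

$$
E[a b-c]=E[a b]-E[c] \approx-0.339-(-0.206)=-0.133
$$

This agrees with the reported value of -0.132 up to rounding of the displayed means.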

- To see its marginal posterior, from the menus, choose View > Posterior.
- Select the difference row in the Custom Estimands table.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6f70e9ee68.jpg)

Most of the area lies to the left of 0, meaning that the difference is almost sure to be negative. In other words, it is almost certain that the indirect effect is more negative than the direct effect. Eyeballing the posterior, perhaps 95% or so of the area lies to the left of 0, so there is about a 95% chance that the indirect effect is larger (more negative) than the direct effect. It is not necessary to rely on eyeballing the posterior, however. There is a way to find any area under a marginal posterior or, more generally, to estimate the probability that any proposition about the parameters is true.

## Dichotomous Custom Estimands

Visual inspection of the frequency polygon reveals that the majority of difference values are negative, but it does not tell us exactly what proportion of values are negative. That proportion is our estimate of the probability that the indirect effect exceeds the direct effect. To estimate probabilities like these, we can use dichotomous estimands. In Visual Basic (or C#) programs, dichotomous estimands are just like numeric estimands except that dichotomous estimands take on only two values: true and false. In order to estimate the probability that the indirect effect is more negative than the direct effect, we need to define a function of the model parameters that is true when the indirect effect is more negative than the direct effect and is false otherwise.

## Defining a Dichotomous Estimand

- Name each dichotomous estimand in the DeclareEstimands subroutine. For purposes of illustration, we will declare two dichotomous estimands, calling them "indirect is less than zero" and "indirect is smaller than direct".

```
'Header'
Public Class CEstimand
    Implements IEstimand
    Public Sub DeclareEstimands() Implements IEstimand.DeclareEstimands
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
        newestimand("indirect is less than zero")
        newestimand("indirect is smaller than direct")
    End Sub
    Public Function CalculateEstimands(ByVal sem As AmosEngine) As String Implements IEstimand.CalculateEstimands
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value = sem.ParameterValue("a") * sem.ParameterValue("b")
        estimand("difference").value = estimand("indirect").value - estimand("direct").value
        Return "" 'Return an empty string if no error occurred
    End Function
End Class
```

- Add lines to the CalculateEstimands function specifying how to compute them.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8bdc84ffd4.jpg)

In this example, the first dichotomous custom estimand is true when the value of the indirect effect is less than 0. The second dichotomous custom estimand is true when the indirect effect is smaller than the direct effect.


- Click File > Run (or click the Run button on the toolbar).

Amos evaluates the truth of each logical expression for each MCMC sample drawn. When the analysis finishes, Amos reports the proportion of MCMC samples in which each expression was found to be true. These proportions appear in the Dichotomous Estimands section of the Custom Estimands summary table.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6f0f530876.jpg)

The $P$ column shows the proportion of times that each evaluated expression was true in the whole series of MCMC samples. In this example, the number of MCMC samples was 29,501, so $P$ is based on approximately 30,000 samples. The $P1$, $P2$, and $P3$ columns show the proportion of times each logical expression was true in the first third, the middle third, and the final third of the MCMC samples. In this illustration, each of these proportions is based upon approximately 10,000 MCMC samples.

Based on the proportions in the Dichotomous Estimands area of the Custom Estimands window, we can say with near certainty that the indirect effect is negative. This is consistent with the frequency polygon on p. 462 that showed no MCMC samples with an indirect effect value greater than or equal to 0.

Similarly, the probability is about 0.975 that the indirect effect is larger (more negative) than the direct effect. The 0.975 is only an estimate of the probability. It is a proportion based on 29,501 correlated observations. However, it appears to be a good estimate because the proportions from the first third (0.974), middle third (0.979), and final third (0.971) are so close together.
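
The way the $P$, $P1$, $P2$, and $P3$ proportions arise can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the draws below are synthetic stand-ins generated with made-up means and standard deviations, not Amos output, and the helper names `proportion_true` and `thirds` are hypothetical.

```python
import random

def proportion_true(flags):
    """Proportion of True values in a sequence of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)

def thirds(flags):
    """Split a sequence into its first, middle, and final thirds."""
    flags = list(flags)
    n = len(flags)
    cut1, cut2 = n // 3, 2 * n // 3
    return flags[:cut1], flags[cut1:cut2], flags[cut2:]

# Synthetic stand-ins for MCMC draws of the parameters a, b, and c.
# The means and standard deviations are assumptions chosen for illustration.
rng = random.Random(0)
draws = [(rng.gauss(0.6, 0.05), rng.gauss(-0.56, 0.05), rng.gauss(-0.2, 0.05))
         for _ in range(30000)]

# Dichotomous estimand: True when the indirect effect a*b is smaller than
# the direct effect c in a given draw.
indirect_smaller = [a * b < c for a, b, c in draws]

P = proportion_true(indirect_smaller)
P1, P2, P3 = (proportion_true(part) for part in thirds(indirect_smaller))
print(round(P, 3), round(P1, 3), round(P2, 3), round(P3, 3))
```

If the three partial proportions are close to one another and to the overall proportion, that agreement is the same informal stability check described above.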

## Example 30

## Data Imputation

## Introduction

This example demonstrates multiple imputation in a factor analysis model.

## About the Example

Example 17 showed how to fit a model using maximum likelihood when the data contain missing values. Amos can also impute values for those that are missing. In data imputation, each missing value is replaced by some numeric guess. Once each missing value has been replaced by an imputed value, the resulting completed dataset can be analyzed by data analysis methods that are designed for complete data. Amos provides three methods of data imputation.

- In regression imputation, the model is first fitted using maximum likelihood. After that, model parameters are set equal to their maximum likelihood estimates, and linear regression is used to predict the unobserved values for each case as a linear combination of the observed values for that same case. Predicted values are then plugged in for the missing values.
- Stochastic regression imputation (Little and Rubin, 2020) imputes values for each case by drawing, at random, from the conditional distribution of the missing values given the observed values, with the unknown model parameters set equal to their maximum likelihood estimates. Because of the random element in stochastic regression imputation, repeating the imputation process many times will produce a different completed dataset each time.
- Bayesian imputation is like stochastic regression imputation except that it takes into account the fact that the parameter values are only estimated and not known.
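
The difference between the first two methods can be sketched for the simplest possible case: one observed variable `x` and one missing variable `y` that are jointly normal. This is a simplified illustration, not Amos's implementation; the parameter values are hypothetical stand-ins for maximum likelihood estimates, and the function names are invented for the example. Bayesian imputation is not shown because it would also require drawing the parameters themselves.

```python
import math
import random

def regression_impute(x, mean_x, mean_y, var_x, cov_xy):
    """Predicted value of y given x (regression imputation)."""
    slope = cov_xy / var_x
    return mean_y + slope * (x - mean_x)

def stochastic_regression_impute(x, mean_x, mean_y, var_x, var_y, cov_xy, rng):
    """Random draw from the conditional distribution of y given x."""
    predicted = regression_impute(x, mean_x, mean_y, var_x, cov_xy)
    residual_var = var_y - cov_xy ** 2 / var_x
    return rng.gauss(predicted, math.sqrt(residual_var))

# Hypothetical parameter values standing in for maximum likelihood estimates.
params = dict(mean_x=20.0, mean_y=18.0, var_x=25.0, var_y=64.0, cov_xy=28.0)

rng = random.Random(1)
x_observed = 25.0
deterministic = regression_impute(x_observed, params["mean_x"], params["mean_y"],
                                  params["var_x"], params["cov_xy"])
random_draws = [stochastic_regression_impute(x_observed, rng=rng, **params)
                for _ in range(3)]
print(deterministic)   # the same value on every run
print(random_draws)    # varies from one draw to the next
```

Regression imputation always returns the conditional mean, whereas stochastic regression imputation scatters its values around that mean with the residual variance, which is why repeated runs give different completed datasets.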


## Multiple Imputation

In multiple imputation (Schafer, 1997), a nondeterministic imputation method (either stochastic regression imputation or Bayesian imputation) is used to create multiple completed datasets. While the observed values never change, the imputed values vary from one completed dataset to the next. Once the completed datasets have been created, each completed dataset is analyzed alone. For example, if there are $m$ completed datasets, then there will be $m$ separate sets of results, each containing estimates of various quantities along with estimated standard errors. Because the $m$ completed datasets are different from each other, the $m$ sets of results will also differ from one to the next.

After each of the $m$ completed datasets has been analyzed alone, the data analyst has $m$ sets of estimates and standard errors that must be combined into a single set of results. Well-known formulas attributed to Rubin (1987) are available for combining the results from multiple completed datasets. Those formulas will be used in Example 31.

## Model-Based Imputation

In this example, imputation is performed using a factor analysis model. Model-based imputation has two advantages. First, you can impute values for any latent variables in the model. Second, if the model is correct and has positive degrees of freedom, the implied covariance matrix and implied means will be estimated more accurately than with a saturated model. (Imputation is based on the implied covariance matrix and means.) However, a saturated model like the model in Example 1 can be used for imputation when no other model is appropriate.

## Performing Multiple Data Imputation Using Amos Graphics

For this example, we will perform Bayesian multiple imputation using the confirmatory factor analysis model from Example 17. The dataset is the incomplete Holzinger and Swineford (1939) dataset in the file grant_x.sav. The imputation of missing values is only the first step in obtaining useful results from multiple imputation. Eventually, all three of the following steps need to be carried out.

- Step 1: Use the Data Imputation feature of Amos to create $m$ complete data files.
- Step 2: Perform an analysis of each of the $m$ completed data files separately.

Performing this analysis is up to you. You can perform the analysis in Amos but, typically, you would use some other program. For purposes of this example and the next, we will use SPSS Statistics to carry out a regression analysis in which one variable (sentence) is used to predict another variable (wordmean). Specifically, we will focus on the estimation of the regression weight and its standard error.

- Step 3: Combine the results from the analyses of the $m$ data files.

This example covers the first step. Steps 2 and 3 will be covered in Example 31.

- To generate the completed data files, open the Amos Graphics file Ex30.amw.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-2a4d1f3664.jpg)
- From the menus, choose Analyze > Data Imputation.

Amos displays the Amos Data Imputation window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e596ac7b39.jpg)

- Make sure that Bayesian imputation is selected.
- Set Number of completed datasets to 10. (This sets $m=10$.)

You might suppose that a large number of completed data files are needed. It turns out that, in most applications, very few completed data files are needed. Five to 10 completed data files are generally sufficient to obtain accurate parameter estimates and standard errors (Rubin, 1987). There is no penalty for using more than 10 imputations except for the clerical effort involved in Steps 2 and 3.

Amos can save the completed datasets in a single file (Single output file) with the completed datasets stacked, or it can save each completed dataset in a separate file (Multiple output files). In a single-group analysis, selecting Single output file yields one output data file, whereas selecting Multiple output files yields $m$ separate data files.

In a multiple-group analysis, when you select the Single output file option, you get a separate output file for each analysis group; if you select the Multiple output files option, you get $m$ output files per group. For instance, if you had four groups and requested five completed datasets, then selecting Single output file would give you four output files, and selecting Multiple output files would give you 20.

Since we are going to use SPSS Statistics to analyze the completed datasets, the simplest thing would be to select Single output file. Then, the split file capability of SPSS Statistics could be used in Step 2 to analyze each completed dataset separately. However, to make it easy to replicate this example using any regression program:

- Select Multiple output files.

You can save imputed data in two file formats: plain text or SPSS Statistics format.

- Click File Names to display a Save As dialog.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-08e472a8a9.jpg)
- In the File name text box, you can specify a prefix name for the imputed datasets. Here, we have specified Grant_Imp.

Amos will name the imputed data files Grant_Imp1, Grant_Imp2, and so on through Grant_Imp10.

- Use the Save as type drop-down list to select plain text (.txt) or the SPSS Statistics format (.sav).
- Click Save.
- Click Options in the Data Imputation window to display the available imputation options.

The online help explains these options. To get an explanation of an option, place your mouse pointer over the option in question and press the F1 key. The figure below shows how the number of observations can be changed from 10,000 (the default) to 30,000.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ab40e1cf20.jpg)

- Close the Options dialog and click the Impute button in the Data Imputation window. After a short time, the following message appears:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-997f5672b3.jpg)
- Click OK.

Amos lists the names of the completed data files.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f8dc6fb4cd.jpg)
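
The file-count and naming rules described in this section can be summarized in a short Python sketch. The function names are hypothetical, and the naming function covers only the single-group case described here (prefix followed by the dataset number).

```python
def completed_file_names(prefix, m, extension=".sav"):
    """Names of the m completed data files in a single-group analysis."""
    return [f"{prefix}{i}{extension}" for i in range(1, m + 1)]

def output_file_count(n_groups, m, multiple_output_files):
    """Number of output files for n_groups groups and m completed datasets."""
    return n_groups * m if multiple_output_files else n_groups

names = completed_file_names("Grant_Imp", 10)
print(names[0], names[-1])                                   # Grant_Imp1.sav Grant_Imp10.sav
print(output_file_count(4, 5, multiple_output_files=False))  # 4
print(output_file_count(4, 5, multiple_output_files=True))   # 20
```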

Each completed data file contains 73 complete cases. Here is a view of the first few records of the first completed data file, Grant_Imp1.sav:

|  | visperc | cubes | lozenges | paragrap | sentence | wordmean | spatial | verbal | CaseNo | Imputation_ |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 33.00 | 22.54 | 17.00 | 8.00 | 17.00 | 10.00 | -3.63 | -0.20 | 1.00 | 1.00 |
| 2 | 30.00 | 30.59 | 20.00 | 12.63 | 17.02 | 18.00 | 3.70 | -0.74 | 2.00 | 1.00 |
| 3 | 38.71 | 33.00 | 36.00 | 16.27 | 25.00 | 41.00 | 7.82 | 4.98 | 3.00 | 1.00 |
| 4 | 28.00 | 22.98 | 10.95 | 10.00 | 18.00 | 11.00 | 5.57 | -1.10 | 4.00 | 1.00 |
| 5 | 30.82 | 25.00 | 20.80 | 11.00 | 21.04 | 8.00 | 3.47 | 1.44 | 5.00 | 1.00 |
| 6 | 20.00 | 25.00 | 6.00 | 9.00 | 17.27 | 25.25 | -2.89 | -1.33 | 6.00 | 1.00 |

Here is an identical view of the second completed data file, Grant_Imp2.sav:

|  | visperc | cubes | lozenges | paragrap | sentence | wordmean | spatial | verbal | CaseNo | Imputation_ |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 33.00 | 25.34 | 17.00 | 8.00 | 17.00 | 10.00 | -0.86 | -3.05 | 1.00 | 2.00 |
| 2 | 30.00 | 33.61 | 20.00 | 9.67 | 19.98 | 18.00 | 0.50 | -2.97 | 2.00 | 2.00 |
| 3 | 40.84 | 33.00 | 36.00 | 19.57 | 25.00 | 41.00 | 14.18 | 10.10 | 3.00 | 2.00 |
| 4 | 28.00 | 24.06 | 22.37 | 10.00 | 18.00 | 11.00 | -1.07 | -1.59 | 4.00 | 2.00 |
| 5 | 28.04 | 25.00 | 11.10 | 11.00 | 24.08 | 8.00 | 3.77 | 0.06 | 5.00 | 2.00 |
| 6 | 20.00 | 25.00 | 6.00 | 9.00 | 23.24 | 28.68 | -2.84 | 1.13 | 6.00 | 2.00 |

The values in the first two cases for visperc were observed in the original data file and therefore do not change across the imputed data files. By contrast, the values for these cases for cubes were missing in the original data file, Grant_x.sav, so Amos has imputed different values across the imputed data files for cubes for these two cases.

In addition to the original observed variables, Amos added four new variables to the imputed data files. Spatial and verbal are imputed latent variable scores. CaseNo and Imputation_ are the case number and completed dataset number, respectively.
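
If the Single output file option had been used instead, the stacked file could be separated into its completed datasets by grouping on `Imputation_`. The following Python sketch shows the idea. It assumes a comma-delimited text layout with a header row, which is an assumption made for illustration and not a documented description of the file Amos writes; the rows are a small subset of the values shown in the two tables above.

```python
import csv
import io
from collections import defaultdict

def split_by_imputation(rows, key="Imputation_"):
    """Group rows (dicts) by completed-dataset number."""
    groups = defaultdict(list)
    for row in rows:
        groups[int(float(row[key]))].append(row)
    return dict(groups)

# Tiny stand-in for a stacked file (assumed comma-delimited layout).
stacked_text = """visperc,cubes,sentence,wordmean,CaseNo,Imputation_
33.00,22.54,17.00,10.00,1.00,1.00
30.00,30.59,17.02,18.00,2.00,1.00
33.00,25.34,17.00,10.00,1.00,2.00
30.00,33.61,19.98,18.00,2.00,2.00
"""

rows = list(csv.DictReader(io.StringIO(stacked_text)))
datasets = split_by_imputation(rows)
for number, cases in sorted(datasets.items()):
    # The observed visperc values repeat; the imputed cubes values differ.
    print(number, [case["visperc"] for case in cases], [case["cubes"] for case in cases])
```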

## Example 31

## Analyzing Multiply Imputed Datasets

## Introduction

This example demonstrates the analysis of multiply (pronounced multiplee) imputed datasets.

## Analyzing the Imputed Data Files Using SPSS Statistics

Ten completed datasets were created in Example 30. That was Step 1 in a three-step process: Use the Data Imputation feature of Amos to impute $m$ complete data files. (Here, $m=10$.) The next two steps are:

- Step 2: Perform an analysis of each of the $m$ completed data files separately.
- Step 3: Combine the results from the analyses of the $m$ data files.

The analysis in Step 2 can be performed using Amos, SPSS Statistics, or any other program. Without knowing ahead of time what program will be used to analyze the completed datasets, it is not possible to automate Steps 2 and 3.

To walk through Steps 2 and 3 for a specific problem, we will analyze the completed datasets by using SPSS Statistics to carry out a regression analysis in which one variable (sentence) is used to predict another variable (wordmean). We will focus specifically on the estimation of the regression weight and its standard error.

## Step 2: Ten Separate Analyses

For each of the 10 completed datasets from Example 30, we need to perform a regression analysis in which sentence is used to predict wordmean. We start by opening the first completed dataset, Grant_Imp1.sav, in SPSS Statistics.

|  | visperc | cubes | lozenges | paragrap | sentence | wordmean | spatial | verbal | CaseNo | Imputation_ |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 33.00 | 22.54 | 17.00 | 8.00 | 17.00 | 10.00 | -3.63 | -0.20 | 1.00 | 1.00 |
| 2 | 30.00 | 30.59 | 20.00 | 12.63 | 17.02 | 18.00 | 3.70 | -0.74 | 2.00 | 1.00 |
| 3 | 38.71 | 33.00 | 36.00 | 16.27 | 25.00 | 41.00 | 7.82 | 4.98 | 3.00 | 1.00 |
| 4 | 28.00 | 22.98 | 10.95 | 10.00 | 18.00 | 11.00 | 5.57 | -1.10 | 4.00 | 1.00 |
| 5 | 30.82 | 25.00 | 20.80 | 11.00 | 21.04 | 8.00 | 3.47 | 1.44 | 5.00 | 1.00 |
| 6 | 20.00 | 25.00 | 6.00 | 9.00 | 17.27 | 25.25 | -2.89 | -1.33 | 6.00 | 1.00 |

- From the SPSS Statistics menus, choose Analyze > Regression > Linear and perform the regression analysis. (We assume you do not need detailed instructions for this step.)

The results are as follows:

Coefficients<sup>a</sup>

| Model |  | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | (Constant) | -2.712 | 3.110 |  | -.872 | .386 |
|  | sentence | 1.106 | .160 | .634 | 6.908 | .000 |

a. Dependent Variable: wordmean

We are going to focus on the regression weight estimate (1.106) and its estimated standard error (0.160). Repeating the analysis that was just performed for each of the other nine completed datasets gives nine more estimates for the regression weight and for its standard error. All 10 estimates and standard errors are shown in the following table:

| Imputation | ML Estimate | ML Standard Error |
| :--- | :--- | :--- |
| 1 | 1.106 | 0.160 |
| 2 | 1.080 | 0.160 |
| 3 | 1.118 | 0.151 |
| 4 | 1.273 | 0.155 |
| 5 | 1.102 | 0.154 |
| 6 | 1.286 | 0.152 |
| 7 | 1.121 | 0.139 |
| 8 | 1.283 | 0.140 |
| 9 | 1.270 | 0.156 |
| 10 | 1.081 | 0.157 |
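
For readers replicating Step 2 outside SPSS Statistics, the regression weight and its standard error can be computed with the usual least-squares formulas. The sketch below is a generic simple-regression routine written for illustration. It is run on only the six cases of Grant_Imp1.sav shown earlier, so its output will not match the 1.106 and 0.160 obtained from all 73 cases.

```python
import math

def simple_regression(x, y):
    """Least-squares slope, intercept, and slope standard error for y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, slope_se

# First six cases of Grant_Imp1.sav as listed earlier (sentence, wordmean).
sentence = [17.00, 17.02, 25.00, 18.00, 21.04, 17.27]
wordmean = [10.00, 18.00, 41.00, 11.00, 8.00, 25.25]

slope, intercept, slope_se = simple_regression(sentence, wordmean)
print(round(slope, 3), round(slope_se, 3))
```

Running the same routine on each of the 10 complete files would yield the 10 pairs of values that Step 3 combines.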

## Step 3: Combining Results of Multiply Imputed Data Files

The standard errors from an analysis of any single completed dataset are not accurate because they do not take into account the uncertainty arising from imputing missing data values. The estimates and standard errors must be gathered from the separate analyses of the completed data files and combined into single summary values, one summary value for the parameter estimate and another summary value for the standard error of the parameter estimate. Formulas for doing this (Rubin, 1987) can be found in many places. The formulas below were taken from Schafer (1997, p. 109). The remainder of this section applies those formulas to the table of 10 estimates and 10 standard errors shown above. In what follows:

Let $m$ be the number of completed datasets ($m=10$ in this case).
Let $\hat{Q}^{(t)}$ be the estimate from sample $t$, so $\hat{Q}^{(1)}=1.106$, $\hat{Q}^{(2)}=1.080$, and so on.
Let $\sqrt{U^{(t)}}$ be the estimated standard error from sample $t$, so $\sqrt{U^{(1)}}=0.160$, $\sqrt{U^{(2)}}=0.160$, and so on.

Then the multiple-imputation estimate of the regression weight is simply the mean of the 10 estimates from the 10 completed datasets:

$$
\bar{Q}=\frac{1}{m} \sum_{t=1}^{m} \hat{Q}^{(t)}=1.172
$$

To obtain a standard error for the combined parameter estimate, go through the following steps:

- Compute the average within-imputation variance.

$$
\bar{U}=\frac{1}{m} \sum_{t=1}^{m} U^{(t)}=0.0233
$$

- Compute the between-imputation variance.

$$
B=\frac{1}{m-1} \sum_{t=1}^{m}\left(\hat{Q}^{(t)}-\bar{Q}\right)^{2}=0.0085
$$

- Compute the total variance.

$$
T=\bar{U}+\left(1+\frac{1}{m}\right) B=0.0233+\left(1+\frac{1}{10}\right)(0.0085)=0.0326
$$

The multiple-imputation standard error is then

$$
\sqrt{T}=\sqrt{0.0326}=0.1807
$$

A test of the null hypothesis that the regression weight is 0 in the population can be based on the statistic

$$
\frac{\bar{Q}}{\sqrt{T}}=\frac{1.172}{0.1807}=6.49
$$

which, if the regression weight is 0, has a $t$ distribution with degrees of freedom given by

$$
v=(m-1)\left[1+\frac{\bar{U}}{\left(1+\frac{1}{m}\right) B}\right]^{2}=(10-1)\left[1+\frac{0.0233}{\left(1+\frac{1}{10}\right)(0.0085)}\right]^{2}=109
$$
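
The combining formulas are easy to check numerically. The following Python sketch applies them to the 10 estimates and standard errors tabulated in Step 2; the function name `combine` and the dictionary keys are hypothetical names chosen for this illustration.

```python
import math

estimates = [1.106, 1.080, 1.118, 1.273, 1.102, 1.286, 1.121, 1.283, 1.270, 1.081]
std_errors = [0.160, 0.160, 0.151, 0.155, 0.154, 0.152, 0.139, 0.140, 0.156, 0.157]

def combine(estimates, std_errors):
    """Combine m estimates and standard errors from m completed datasets."""
    m = len(estimates)
    q_bar = sum(estimates) / m                                  # combined estimate
    u_bar = sum(se ** 2 for se in std_errors) / m               # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)      # between-imputation variance
    total = u_bar + (1 + 1 / m) * b                             # total variance
    df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2         # degrees of freedom
    se = math.sqrt(total)
    return {"q_bar": q_bar, "u_bar": u_bar, "b": b, "total": total,
            "se": se, "ratio": q_bar / se, "df": df}

result = combine(estimates, std_errors)
for name, value in result.items():
    print(name, round(value, 4))
```

Carrying full precision through the calculation gives degrees of freedom of about 109.3, which agrees with the reported 109; substituting the rounded intermediate values 0.0233 and 0.0085 by hand gives a slightly larger figure.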

Joseph Schafer's NORM program performs these calculations. NORM can be downloaded from http://www.stat.psu.edu/~jls/misoftwa.html#win.

## Further Reading

Amos provides several advanced methods of handling missing data, including FIML (described in Example 17), multiple imputation, and Bayesian estimation. For an overview of the strengths of FIML and multiple imputation, consult Schafer and Graham (2002). Allison (2002) has a concise, readable monograph that covers both FIML and multiple imputation, including a number of worked examples and an excellent discussion of how to handle non-normal and categorical variables within the context of multiple imputation methods that assume multivariate normality. Schafer (1997) provides an in-depth, technical treatment of multiple imputation. Schafer and Olsen (1998) provide a readable, step-by-step guide to performing multiple imputation.

A SEM-specific study comparing the statistical performance of FIML and multiple imputation in structural equation models is also available (Olinsky, Chen, and Harlow, 2003). Lastly, it is worth noting that the Bayesian estimation approach discussed in Examples 26 through 29 is similar to FIML in its handling of missing data. Ibrahim and colleagues recently compared the performance of FIML, Bayesian estimation, probability weighting, and multiple imputation approaches to address incomplete data problems and concluded that these four approaches were generally similar in their satisfactory performance for handling incomplete data problems in which the missing data arose from a missing-at-random (MAR) process (Ibrahim, Chen, Lipsitz, and Herring, 2005). While their review considered generalized linear models rather than SEM, their results and conclusions should be generally applicable to a wide range of statistical models and data analysis scenarios, including those featuring SEM.

## Example 32

## Censored Data

## Introduction

This example demonstrates parameter estimation, estimation of posterior predictive distributions, and data imputation with censored data.

## About the Data

For this example, we use the censored data from 103 patients who were accepted into the Stanford Heart Transplantation Program during the years 1967 through 1974. The data were collected by Crowley and Hu (1977) and have been reanalyzed by Kalbfleisch and Prentice (2002), among others. The dataset is saved in the file transplant-a.sav.


|  | acceptyr | age | time | timesqr | status |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 17 | 1968 | 20.33 | 35 | 5.916 | uncensored |
| 18 | 1968 | 56.85 | 42 | 6.481 | uncensored |
| 19 | 1968 | 59.12 | 36 | 6.000 | uncensored |
| 20 | 1969 | 55.28 | 27 | 5.196 | uncensored |
| 21 | 1969 | 43.34 | 1031 | 32.109 | uncensored |
| 22 | 1969 | 42.78 | 50 | 7.071 | uncensored |
| 23 | 1969 | 58.36 | 732 | 27.055 | uncensored |
| 24 | 1969 | 51.80 | 218 | 14.765 | uncensored |
| 25 | 1969 | 33.22 | 1799 | 42.415 | censored |
| 26 | 1969 | 30.54 | 1400 | 37.417 | censored |
| 27 | 1969 | 8.79 | 262 | 16.186 | uncensored |

Reading across the first visible row in the figure above, Patient 17 was accepted into the program in 1968. The patient at that time was 20.33 years old. The patient died 35 days later. The next number, 5.916, is the square root of 35. Amos assumes that censored variables are normally distributed. The square root of survival time will be used in this example in the belief that it is probably more nearly normally distributed than is survival time itself. Uncensored simply means that we know how long the patient lived. In other words, the patient has already died, and that is how we are able to tell that he lived for 35 days after being admitted into the program.

Some patients were still alive when last seen. For example, Patient 25 entered the program in 1969 at the age of 33.22 years. The patient was last seen 1,799 days later. The number 42.415 is the square root of 1,799. The word censored in the Status column means that the patient was still alive 1,799 days after being accepted into the program, and that is the last time the patient was seen. So, we can't say that the patient survived for 1,799 days. In fact, he survived for longer than that; we just don't know how much longer. There are more cases like that. Patient number 26 was last seen 1,400 days after acceptance into the program and, at that time, was still alive, so we know that that patient lived for at least 1,400 days.

It is not clear what is to be done with a censored value like Patient 25's survival time of 1,799 days. You can't just discard the 1,799 and all the other censored values because that amounts to discarding the patients who lived a long time. On the other hand, you can't keep the 1,799 and treat it as an ordinary score because you know the patient really lived for more than 1,799 days.

In Amos, you can use the information that Patient 25 lived for more than 1,799 days, neither discarding the information nor pretending that the patient's survival time is known more precisely than it is. Of course, wherever the data provide an exact numeric value, as in the case of Patient 24 who is known to have survived for 218 days, that exact numeric value is used.

## Recoding the Data

The data file needs to be recoded before Amos reads it. The next figure shows a portion of the dataset after recoding. (This complete dataset is in the file transplant-b.sav.)

|  | acceptyr | age | time | timesqr |
| :--- | :--- | :--- | :--- | :--- |
| 17 | 1968 | 20.33 | 35 | 5.916 |
| 18 | 1968 | 56.85 | 42 | 6.481 |
| 19 | 1968 | 59.12 | 36 | 6.000 |
| 20 | 1969 | 55.28 | 27 | 5.196 |
| 21 | 1969 | 43.34 | 1031 | 32.109 |
| 22 | 1969 | 42.78 | 50 | 7.071 |
| 23 | 1969 | 58.36 | 732 | 27.055 |
| 24 | 1969 | 51.80 | 218 | 14.765 |
| 25 | 1969 | 33.22 | > 1799 | > 42.415 |
| 26 | 1969 | 30.54 | > 1400 | > 37.417 |
| 27 | 1969 | 8.79 | 262 | 16.186 |

Every uncensored observation appears in the new data file just the way it did in the original data file. Censored values, however, are coded differently. For example, Patient 25's survival time, which is known only to be greater than 1,799, is coded as > 1799 in the new data file. (Spaces in a string like > 1799 are optional.) The square root of survival time is known to be greater than 42.415, so the timesqr column of the data file contains > 42.415 for Patient 25. For data file formats (like SPSS Statistics) that make a distinction between numeric and string variables, time and timesqr need to be coded as string variables.
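
The recoding rule can be expressed as a small function. The Python sketch below is only an illustration of the rule just described, applied to three of the rows shown above; the function name `recode` is hypothetical, and how the recoded strings are written back to a data file is left out.

```python
import math

def recode(time, status):
    """Return (time, timesqr) as strings, prefixing censored values with '>'."""
    timesqr = f"{math.sqrt(time):.3f}"
    if status == "censored":
        return f"> {time}", f"> {timesqr}"
    return str(time), timesqr

# Patients 24 through 26 from transplant-a.sav as shown above: (time, status).
original = [(218, "uncensored"), (1799, "censored"), (1400, "censored")]
for time, status in original:
    print(recode(time, status))
```

Both columns are returned as strings because a value such as > 1799 cannot be stored in a numeric variable.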

## Analyzing the Data

To specify the data file in Amos Graphics:

- From the menus, choose File > Data Files.
- Then in the Data Files dialog, click the File Name button.
- Select the data file transplant-b.sav.
- Select Allow non-numeric data (a check mark appears next to it).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d1da520776.jpg)

Recoding the data as shown above and selecting Allow non-numeric data are the only extra steps that are required for analyzing censored data. In all other respects, fitting a model with censored data and interpreting the results is exactly the same as if the data were purely numeric.

## Performing a Regression Analysis

Let's try predicting timesqr using age and year of acceptance (acceptyr) as predictors. Begin by drawing the following path diagram:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4140b62015.jpg)

To fit the model:

- Click the Bayesian Estimation button on the toolbar.
or
- From the menus, choose Analyze > Bayesian Estimation.

Note: The Calculate Estimates button is disabled because, with non-numeric data, you can perform only Bayesian estimation.
After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face. The table of estimates in the Bayesian SEM window should look something like this:

|  | Mean | S.E. | S.D. | C.S. | Median | Skewness | Kurtosis | Min | Max |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |
| timesqr<--age | -0.29 | 0.00 | 0.15 | 1.00 | -0.29 | -0.05 | 0.13 | -0.94 | 0.38 |
| timesqr<--acceptyr | 1.45 | 0.00 | 0.81 | 1.00 | 1.43 | 0.10 | 0.09 | -1.55 | 4.93 |
| Means |  |  |  |  |  |  |  |  |  |
| age | 45.17 | 0.00 | 1.00 | 1.00 | 45.18 | -0.01 | 0.05 | 40.93 | 49.76 |
| acceptyr | 1970.61 | 0.00 | 0.19 | 1.00 | 1970.61 | 0.00 | 0.08 | 1969.73 | 1971.43 |

(Only a portion of the table is shown in the figure.) The Mean column contains point estimates for the parameters. The regression weight for using acceptyr to predict timesqr is 1.45, so that each time the calendar advances by one year, you predict an increase of 1.45 in the square root of survival time. This suggests that the transplant program may have been improving over the period covered by the study. The regression weight for using age to predict timesqr is -0.29, so for every year older a patient is when admitted into the transplant program, you expect a decrease of 0.29 in the square root of survival time. The regression weight estimate of -0.29 is actually the mean of the posterior distribution of the regression weight.
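
As a worked illustration of this interpretation, the short Python sketch below uses the two posterior means from the table to compute the predicted difference in timesqr between two patients. The function name is hypothetical, and because only differences are computed, the intercept (not shown in the table excerpt) is not needed.

```python
# Posterior means of the regression weights, taken from the table above.
WEIGHT_AGE = -0.29
WEIGHT_ACCEPTYR = 1.45

def predicted_change(age_difference, acceptyr_difference):
    """Predicted difference in timesqr between two patients."""
    return WEIGHT_AGE * age_difference + WEIGHT_ACCEPTYR * acceptyr_difference

# A patient 10 years older, accepted in the same year.
print(round(predicted_change(10, 0), 2))
# A patient of the same age, accepted 2 years later.
print(round(predicted_change(0, 2), 2))
```

These are point predictions on the square-root scale; they ignore the posterior uncertainty in the weights that the S.D. column reports.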

- To see the entire posterior distribution, right-click the row that contains the -0.29 estimate and choose Show Posterior from the pop-up menu.


The Posterior dialog opens, displaying the posterior distribution of the regression weight.

Group number 1
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-521a5c77be.jpg)

The posterior distribution of the regression weight is indeed centered around -0.29. The distribution lies almost entirely between -0.75 and 0.25, so it is practically guaranteed that the regression weight lies in that range. Most of the distribution lies between -0.5 and 0, so we are pretty sure that the regression weight lies between -0.5 and 0.

## Posterior Predictive Distributions

Recall that the dataset contains some censored values like Patient 25's survival time. All we really know about Patient 25's survival time is that it is longer than 1,799 days or, equivalently, that the square root of survival time exceeds 42.415. Even though we do not know the amount by which this patient's timesqr exceeds 42.415, we can ask for its posterior distribution. Taking into account the fact that timesqr exceeds 42.415, assuming that the model is correct, and taking the patient's age and acceptyr into account, what can be said about Patient 25's survival time? To find out:

- Click the Posterior Predictive button.
or
- From the menus, choose View > Posterior Predictive.

Posterior Predictive Distributions

|  | timesqr | age | acceptyr |
| :--- | :--- | :--- | :--- |
| 17 | 5.916 | 20.33127995 | 1968 |
| 18 | 6.481 | 56.84873374 | 1968 |
| 19 | 6 | 59.12388775 | 1968 |
| 20 | 5.196 | 55.27994524 | 1969 |
| 21 | 32.109 | 43.34291581 | 1969 |
| 22 | 7.071 | 42.78439425 | 1969 |
| 23 | 27.055 | 58.35728953 | 1969 |
| 24 | 14.765 | 51.80013689 | 1969 |
| 25 | << | 33.2238193 | 1969 |
| 26 | << | 30.53524983 | 1969 |
| 27 | 16.186 | 8.785763176 | 1969 |

The Posterior Predictive Distributions window shows a table with a row for every person and a column for every observed variable in the model. Looking in the 25th row, we see Patient 25's age and acceptyr scores. For Patient 25's timesqr, all we see is the symbol <<, which indicates that the data provide an inequality constraint on the timesqr score and not an actual numeric value.

To see the posterior distribution of Patient 25's timesqr:

- Click <<. The posterior distribution appears in the Posterior window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-148dffabd5.jpg)

The posterior distribution for Patient 25's timesqr lies entirely to the right of 42.415. Of course, we knew from the data alone that timesqr exceeds 42.415, but now we also know that there is practically no chance that Patient 25's timesqr exceeds 70. For that matter, there is only a slim chance that timesqr exceeds even 55.

To see a posterior predictive distribution that looks very different from Patient 25's:

- Click the << symbol in the 100th row of the Posterior Predictive Distributions table.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-741d127fad.jpg)

Patient 100 was still alive when last observed on the 38th day after acceptance into the program, so his timesqr is known to exceed 6.164. The posterior distribution of that patient's timesqr shows that it is practically guaranteed to be between 6.164 and 70, and almost certain to be between 6.164 and 50. The mean is 27.36, providing a point estimate of timesqr if one is needed. Squaring 27.36 gives about 748, an estimate of Patient 100's survival time in days.
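The square-root arithmetic behind the censored values in this example can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `timesqr_bound` and `days_from_timesqr` are hypothetical and are not part of Amos.

```python
import math


def timesqr_bound(days):
    """Lower bound on timesqr implied by a survival time censored at `days`."""
    return math.sqrt(days)


def days_from_timesqr(timesqr):
    """Back-transform a timesqr value to a survival time in days."""
    return timesqr ** 2


# Patient 25 was still alive after 1,799 days.
print(round(timesqr_bound(1799), 3))   # 42.415

# Patient 100 was still alive after 38 days.
print(round(timesqr_bound(38), 3))     # 6.164

# Point estimate for Patient 100 from the posterior mean of 27.36.
print(days_from_timesqr(27.36))
```

Squaring the posterior mean of timesqr gives a convenient point estimate in days, but it is generally not the same as the posterior mean of the survival time itself.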

## Imputation

You can use this model to impute values for the censored values.

- Close the Bayesian SEM window if it is open.
- From the Amos Graphics menu, choose Analyze > Data Imputation.


Amos Data Imputation (dialog): imputation method options Regression imputation, Stochastic regression imputation, and Bayesian imputation (selected); Number of completed datasets: 10; output options Multiple output files and Single output file (selected); buttons Options, Help, File Names, and Impute.

| Incomplete Data Fil... | Completed Data Files |
| :--- | :--- |
| transplant-b | transplant-b_C.sav |

Notice that Regression imputation and Stochastic regression imputation are disabled. When you have non-numeric data such as censored data, Bayesian imputation is the only choice.

We will accept the options shown in the preceding figure, creating 10 completed datasets and saving all 10 in a single SPSS Statistics data file called transplant-b_C.sav. To start the imputation:

- Click the Impute button.

The Bayesian SEM window opens along with the Data Imputation dialog.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-477a74388b.jpg)

- Wait until the Data Imputation dialog displays a happy face to indicate that each of the 10 completed datasets is effectively uncorrelated with the others.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-65785e9dff.jpg)

Note: After you see a happy face but before you click OK, you may optionally choose to right-click a parameter in the Bayesian SEM window and choose Show Posterior from the pop-up menu. This will allow you to examine the Trace and Autocorrelation plots.

- Click OK in the Data Imputation dialog.

The Summary window shows a list of the completed data files that were created. In this case, only one completed data file was created.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e27130e581.jpg)

- Double-click the file name to display the contents of the single completed data file, which contains 10 completed datasets.


The file contains 1,030 cases because each of the 10 completed datasets contains 103 cases. The first 103 rows of the new data file contain the first completed dataset. The Imputation_ variable is equal to 1 for each row in the first completed dataset, and the CaseNo variable runs from 1 through 103.

|  | timesqr | age | acceptyr | CaseNo | Imputation |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 7.00 | 30.84 | 1967.00 | 1.00 | 1.00 |
| 2 | 2.24 | 51.84 | 1968.00 | 2.00 | 1.00 |
| 3 | 3.87 | 54.30 | 1968.00 | 3.00 | 1.00 |
| 4 | 6.16 | 40.26 | 1968.00 | 4.00 | 1.00 |
| 5 | 4.12 | 20.79 | 1968.00 | 5.00 | 1.00 |
| 6 | 1.41 | 54.60 | 1968.00 | 6.00 | 1.00 |
| 7 | 25.96 | 50.87 | 1968.00 | 7.00 | 1.00 |
| 8 | 6.25 | 45.35 | 1968.00 | 8.00 | 1.00 |
| 9 | 9.16 | 47.16 | 1968.00 | 9.00 | 1.00 |
| 10 | 7.55 | 42.50 | 1968.00 | 10.00 | 1.00 |
| 11 | 12.33 | 47.98 | 1968.00 | 11.00 | 1.00 |
| 12 | 2.65 | 53.19 | 1968.00 | 12.00 | 1.00 |
| 13 | 8.94 | 54.57 | 1968.00 | 13.00 | 1.00 |
| 14 | 37.23 | 54.01 | 1968.00 | 14.00 | 1.00 |
| 15 | 0.00 | 53.82 | 1968.00 | 15.00 | 1.00 |
| 16 | 17.52 | 49.45 | 1968.00 | 16.00 | 1.00 |
| 17 | 5.92 | 20.33 | 1968.00 | 17.00 | 1.00 |
| 18 | 6.48 | 56.85 | 1968.00 | 18.00 | 1.00 |
| 19 | 6.00 | 59.12 | 1968.00 | 19.00 | 1.00 |
| 20 | 5.20 | 55.28 | 1969.00 | 20.00 | 1.00 |
| 21 | 32.11 | 43.34 | 1969.00 | 21.00 | 1.00 |
| 22 | 7.07 | 42.78 | 1969.00 | 22.00 | 1.00 |
| 23 | 27.06 | 58.36 | 1969.00 | 23.00 | 1.00 |
| 24 | 14.77 | 51.80 | 1969.00 | 24.00 | 1.00 |
| 25 | 49.66 | 33.22 | 1969.00 | 25.00 | 1.00 |
| 26 | 41.67 | 30.54 | 1969.00 | 26.00 | 1.00 |

The first row of the completed data file contains a timesqr value of 7. Because that was not a censored value, 7 is not an imputed value. It is just an ordinary numeric value that was present in the original data file. On the other hand, Patient 25's timesqr was censored, so that patient has an imputed timesqr (in this case, 49.66). The value of 49.66 is a value drawn randomly from the posterior predictive distribution shown earlier for Patient 25's timesqr.

Normally, the next step would be to use the 10 completed datasets in transplant-b_C.sav as input to some other program that cannot accept censored data. You would use that other program to perform 10 separate analyses, using each one of the 10 completed datasets in turn. Then you would do further computations to combine the results of those 10 separate analyses into a single set of results, as was done in Example 31. Those steps will not be carried out here.
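As a rough illustration of the "separate analyses, then combine" workflow, the following Python sketch splits a stacked file by its imputation number and pools one parameter's results. It assumes that the combination step uses Rubin's rules, a widely used method for multiply imputed data. The function names and the numbers are hypothetical, and the rows are assumed to be dictionaries keyed by variable name.

```python
from collections import defaultdict
from statistics import mean, variance


def split_by_imputation(rows, key="Imputation_"):
    """Group the rows of a stacked file into one list per completed dataset."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def pool(estimates, std_errors):
    """Pool one parameter's estimates across completed datasets (Rubin's rules)."""
    m = len(estimates)
    pooled = mean(estimates)                        # pooled point estimate
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance between datasets
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical estimates and standard errors from three separate analyses
estimate, std_error = pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, round(std_error, 3))
```

The pooled standard error is larger than any single-analysis standard error whenever the estimates differ across completed datasets, which is how the uncertainty due to imputation enters the final result.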

## General Inequality Constraints on Data Values

This example employed only inequality constraints like $>1799$. Here are some other examples of string values that can be used in a data file to place inequality constraints on the value of an underlying numeric variable:

- The string value <5 means that the underlying numeric value is less than 5.
- The string value 4<<5 means that the underlying numeric value is between 4 and 5.
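To make the three forms of string value concrete, here is a small Python sketch that turns such a string into a pair of lower and upper bounds. The function `parse_constraint` is hypothetical and written only for illustration. It is not how Amos reads data files, and it makes no attempt to handle malformed input.

```python
import math


def parse_constraint(value):
    """Return (lower, upper) bounds implied by a data value string.

    Handles the forms ">a", "<b", and "a<<b". A plain number is
    returned as the degenerate interval (x, x).
    """
    text = value.strip()
    if "<<" in text:
        low, high = text.split("<<")
        return float(low), float(high)
    if text.startswith(">"):
        return float(text[1:]), math.inf
    if text.startswith("<"):
        return -math.inf, float(text[1:])
    x = float(text)
    return x, x


for value in [">1799", "<5", "4<<5", "7"]:
    print(value, parse_constraint(value))
```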


## Ordered-Categorical Data

## Introduction

This example shows how to fit a factor analysis model to ordered-categorical data. It also shows how to find the posterior predictive distribution for the numeric variable that underlies a categorical response and how to impute a numeric value for a categorical response.

## About the Data

This example uses data on attitudes toward environmental issues obtained from a questionnaire administered to 1,017 respondents in the Netherlands. The data come from the European Values Study Group (see the bibliography for a citation). The data file environment-nl-string.sav contains responses to six questionnaire items with categorical responses strongly disagree (SD), disagree (D), agree (A), and strongly agree (SA).

Example 33

|  | item1 | item2 | item3 | item4 | item5 | item6 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | A |  | SA | SD | A | A |
| 2 | A |  | A | SA | SA | SA |
| 3 |  | A | A | A | A | A |
| 4 | A | A | A |  |  |  |
| 5 | D | SD |  |  | D |  |
| 6 | SA | SA | A |  | A | A |
| 7 | A | D |  | A | A | A |
| 8 | D | D |  | SD |  | SD |
| 9 | SA | SA | SA | A |  | A |
| 10 | SA | A | A | SA | SA |  |
| 11 | A | A | A | A |  | A |
| 12 | SA | SA | A | A |  | A |

One way to analyze these data is to assign numbers to the four categorical responses; for example, using the assignment 1 = SD, 2 = D, 3 = A, 4 = SA. If you assign numbers to categories in that way, you get the dataset in environment-nl-numeric.sav.

|  | item1 | item2 | item3 | item4 | item5 | item6 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 3 |  | 4 | 1 | 3 | 3 |
| 2 | 3 |  | 3 | 4 | 4 | 4 |
| 3 | - | 3 | 3 | 3 | 3 | 3 |
| 4 | 3 | 3 | 3 | - | - | - |
| 5 | 2 | 1 | - | - | 2 | - |
| 6 | 4 | 4 | 3 | - | 3 | 3 |
| 7 | 3 | 2 | - | 3 | 3 | 3 |
| 8 | 2 | 2 | - | 1 | - | 1 |
| 9 | 4 | 4 | 4 | 3 | - | 3 |
| 10 | 4 | 3 | 3 | 4 | 4 | - |
| 11 | 3 | 3 | 3 | 3 | - | 3 |
| 12 | 4 | 4 | 3 | 3 | - | 3 |

In an Amos analysis, it is not necessary to assign numbers to categories in the way just shown. It is possible to use only the ordinal properties of the four categorical responses. If you want to use only the ordinal properties of the data, you can use either dataset, environment-nl-string.sav or environment-nl-numeric.sav.

It may be slightly easier to use environment-nl-numeric.sav because Amos will assume by default that the numbered categories go in the order 1, 2, 3, 4, with 1 being the lowest category. That happens to be the correct order. With environment-nl-string.sav, by contrast, Amos will assume by default that the categories are arranged alphabetically in the order A, D, SA, SD, with A being the lowest category. That is the wrong order, so the default ordering of the categories by Amos has to be overridden.
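The difference between the alphabetical ordering and the intended ordering is easy to see in a short Python sketch. The variable names are illustrative only.

```python
# Intended order of the responses, from lowest to highest
responses = ["SD", "D", "A", "SA"]

# Sorting the labels as strings gives the alphabetical order instead
alphabetical = sorted(responses)
print(alphabetical)   # ['A', 'D', 'SA', 'SD']

# Rank of each response under the intended ordering (1 = lowest),
# matching the numeric coding 1 = SD, 2 = D, 3 = A, 4 = SA
rank = {label: i for i, label in enumerate(responses, start=1)}
print(rank)
```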

The data file environment-nl-string.sav will be used for this example because then it will be clear that only the ordinal properties of the data are employed, and also you can see how to specify the correct ordering of the categories.

## Specifying the Data File

- From the Amos Graphics menus, choose File > Data Files.
- In the Data Files window, click the File Name button.
- Select the data file environment-nl-string.sav.
- Select Allow non-numeric data (a check mark appears next to it).
- Click OK.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6a101a33f3.jpg)


## Recoding the Data within Amos

The ordinal properties of the data cannot be inferred from the data file alone. To give Amos the additional information it needs so that it can interpret the data values SD, D, A, and SA:

- From the Amos Graphics menus, choose Tools > Data Recode.
- Select item1 in the list of variables in the upper-left corner of the Data Recode window. This displays a frequency distribution of the responses to item1 at the bottom of the window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4b960cec9d.jpg)

In the box labeled Recoding rule, the notation No recoding means that Amos will read the responses to item1 as is. In other words, it will read either SD, D, A, SA, or an empty string. We can't leave things that way because Amos doesn't know what to do with SD, D, and so on.

- Click No recoding and select Ordered-categorical from the drop-down list.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0828ddc190.jpg)

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0fbc3042f5.jpg)

The frequency table at the bottom of the window now has a New Value column that shows how the item1 values in the data file will be recoded before Amos reads the data. The first row of the frequency table shows that empty strings in the original data file will be treated as missing values. The second row shows that the A response will be translated into the string <0.0783345405060296. Amos will interpret this to mean that there is a continuous numeric variable that underlies responses to item1, and that a person who gives the A response has a score that is less than 0.0783345405060296 on that underlying variable. Similarly, the third row shows that the D response will be translated into the string 0.0783345405060296<<0.442569286522029 and interpreted by Amos to mean that the score on the underlying numeric variable is between 0.0783345405060296 and 0.442569286522029. The numbers 0.0783345405060296, 0.442569286522029, and so on, are derived from the frequencies in the Frequency column, based on the assumption that scores on the underlying numeric variable are normally distributed with a mean of 0 and a standard deviation of 1.
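One natural way to derive boundaries of this kind is to apply the inverse of the standard normal cumulative distribution function to the cumulative proportions of the ordered categories. The Python sketch below assumes that approach. The function name and the counts are hypothetical (they are not the item1 frequencies), and every category is assumed to have a nonzero count.

```python
from itertools import accumulate
from statistics import NormalDist


def boundaries_from_counts(counts):
    """Boundaries on a standard normal scale for ordered category counts."""
    total = sum(counts)
    # Cumulative counts, dropping the last one (cumulative proportion 1.0)
    cumulative = list(accumulate(counts))[:-1]
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return [standard_normal.inv_cdf(c / total) for c in cumulative]


# Hypothetical counts for four ordered categories, lowest category first
counts = [500, 300, 150, 50]
print([round(b, 3) for b in boundaries_from_counts(counts)])
```

With four categories there are three boundaries. The lowest category is recoded as lying below the first boundary, and the highest as lying above the last.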

The ordering of the categories in the Original Value column needs to be changed. To change the ordering:

- Click the Details button. The Ordered-Categorical Details dialog opens.

Unordered categories

Ordered-Categorical Details
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-031ec2459c.jpg)

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b055db4ebe.jpg)

Ordered categories
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3cd6d72ac5.jpg)

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8a467d6896.jpg)

The Ordered categories list box shows four response categories arranged in the order A, D, SA, SD, and separated from each other by dashed lines, <---->. The dashed lines represent three boundaries that divide the real numbers into four intervals, with the four intervals being associated with the four categorical responses. The assumption is made that a person who scores below the lowest boundary on some unobserved numeric variable gives the A response. A person who scores between the lowest boundary and the middle boundary gives the D response. A person who scores between the middle boundary and the highest boundary gives the SA response. Finally, a person who scores above the highest boundary gives the SD response.

The program is correct about there being four categories (intervals) and three boundaries, but it has the ordering of the categories wrong. The program arbitrarily alphabetized the categories. We need to keep the four categories and the three boundaries but rearrange them. We want SD to fall in the lowest interval (below the lowest boundary), and so on.


You can rearrange the categories and the boundaries. To do this:

- Drag and drop with the mouse.
or
- Select a category or boundary with the mouse and then click the Up or Down button.

After you put the categories and boundaries in the correct order, the Ordered-Categorical Details dialog looks like this:

Unordered categories

Ordered-Categorical Details
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-203e431d20.jpg)

The Unordered categories list box contains a list of values that Amos will treat as missing. At the moment, the list contains one entry, [empty string], so that Amos will treat an empty string as a missing value. If a response coded as an empty string was actually a response that could be meaningfully compared to SD, D, A, and SA, then you would select [empty string] in the Unordered categories list box and click the Down button to move [empty string] into the Ordered categories list box.

Similarly, if a response in the Ordered categories list box, for example SD, was not comparable to the other responses, you would select it with the mouse and click the Up button to move it into the Unordered categories list box. Then SD would be treated as a missing value.

Note: You can't drag and drop between the Ordered categories list box and the Unordered categories list box. You have to use the Up and Down buttons to move a category from one box to the other.

We could stop here and close the Ordered-Categorical Details dialog because we have the right number of boundaries and categories and we have the categories going in the right order. However, we will make a further change based on a suggestion by Croon (2002), who also worked with this dataset and concluded that the SD category occurred so seldom that it should be combined with the D category. To merge those two categories into a single category:

- Select the boundary between the two categories you want to merge.
- Click the Remove Boundary button. The Ordered categories list now looks like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-59d0698869.jpg)

Now the SD response and the D response are indistinguishable. Either response means that the person who gave the response has a score that lies in the lowest interval on the underlying numeric variable.


There remains the question of the values of the two boundaries that separate the three intervals. If you do not specify values for the boundaries, Amos will estimate the boundaries by assuming that scores on the underlying numeric variable are normally distributed with a mean of 0 and a standard deviation of 1. Alternatively, you can assign a value to a boundary instead of letting Amos estimate it. To assign a value:

- Select the boundary with the mouse.
- Type a numeric value in the text box.

The following figure shows the result of assigning values 0 and 1 to the two boundaries.

Unordered categories

Ordered-Categorical Details
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6b22b90bfb.jpg)


Ordered categories
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f537913631.jpg)

Although it may not be obvious, it is permissible to assign 0 and 1, or any pair of numbers, to the two boundaries, as long as the higher boundary is assigned a larger value than the lower one. No matter how many boundaries there are (as long as there are at least two), assigning values to two of the boundaries amounts to choosing a zero point and a unit of measurement for the underlying numeric variable. The scaling of the underlying numeric variable is discussed further in the Help file under the topic "Choosing boundaries when there are three categories."
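The idea that fixing two boundaries chooses a zero point and a unit of measurement can be illustrated with a linear change of scale. The Python sketch below is a general illustration of that idea, not a description of Amos's internal computation. The function name and the starting boundaries of -0.5 and 1.5 are hypothetical.

```python
def fix_two_boundaries(b_low, b_high, new_low=0.0, new_high=1.0):
    """Return (slope, intercept) of the linear map that sends
    b_low to new_low and b_high to new_high."""
    if not (b_low < b_high and new_low < new_high):
        raise ValueError("the higher boundary must get the larger value")
    slope = (new_high - new_low) / (b_high - b_low)
    intercept = new_low - slope * b_low
    return slope, intercept


# Hypothetical boundaries on a scale with mean 0 and standard deviation 1
slope, intercept = fix_two_boundaries(-0.5, 1.5)

# On the new scale the boundaries are 0 and 1, while the mean and the
# standard deviation of the underlying variable are no longer 0 and 1.
new_mean = intercept   # where the old mean of 0 lands
new_sd = slope         # what the old standard deviation of 1 becomes
print(new_mean, new_sd)
```

Because only the zero point and the unit change, the ordering of scores and the probability of each response category are unaffected by the choice of boundary values.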

- Click OK to close the Ordered-Categorical Details dialog.

The changes that were just made to the categories and the interval boundaries are now reflected in the frequency table at the bottom of the Data Recode window.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-01036de2a8.jpg)

The frequency table shows how the values that appear in the data file will be recoded before Amos reads them. Reading the frequency table from top to bottom:

- An empty string will be treated as a missing value.
- The strings SD and D will be recoded as <0, meaning that the underlying numeric score is less than 0.
- A will be recoded as 0<<1, meaning that the underlying numeric score is between 0 and 1.
- SA will be recoded as >1, meaning that the underlying numeric score is greater than 1.
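The recoding rule just described amounts to a simple lookup table. The following Python sketch applies it to one row of responses. The names `RECODE_RULE` and `recode` are hypothetical, and the asterisk is used here only as a display marker for a missing value.

```python
# Lookup table for the recoding rule: SD and D share the lowest interval
RECODE_RULE = {
    "": "*",       # empty string: treated as a missing value
    "SD": "<0",
    "D": "<0",
    "A": "0<<1",
    "SA": ">1",
}


def recode(responses):
    """Recode a list of categorical responses into inequality strings."""
    return [RECODE_RULE[r] for r in responses]


# Responses of the first person in environment-nl-string.sav (item1 to item6)
print(recode(["A", "", "SA", "SD", "A", "A"]))
```

The sketch assumes that the same rule is used for all six items, which is the case once the steps for item1 have been repeated for the remaining variables.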


That takes care of item1. What was just done for item1 has to be repeated for each of the five remaining observed variables. After specifying the recoding for all six observed variables, you can view the original dataset along with the recoded variables. To do this:

- Click the View Data button.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f0df91e9d0.jpg)

The table on the left shows the contents of the original data file before recoding. The table on the right shows the recoded variables after recoding. When Amos performs an analysis, it reads the recoded values, not the original values.

Note: You can create a raw data file in which the data recoding has already been performed. In other words, you can create a raw data file that contains the inequalities on the right-hand side of the figure above. In that case, you wouldn't need to use the Data Recode window in Amos. Indeed, that approach was used in Example 32.

- Finally, close the Data Recode window before specifying the model.


## Specifying the Model

After you have specified the rules for data recoding as shown above, the analysis proceeds just like any Bayesian analysis. For this example, a factor analysis model will be fitted to the six questionnaire items in the environment dataset. The first three items were designed to be measures of willingness to spend money to take care of the environment. The other three items were designed to be measures of awareness of environmental issues. This design of the questionnaire is reflected in the following factor analysis model, which is saved in the file Ex33-a.amw.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-afeeb34191.jpg)

The path diagram is drawn exactly as it would be drawn for numeric data. This is one of the good things about having at least three categories for each ordered-categorical variable: You can specify a model in the way that you are used to, just as though all the variables were numeric, and the model will work for any combination of numeric and ordered-categorical variables. If variables are dichotomous, you will need to impose additional parameter constraints in order to make the model identified. This issue is discussed further in the online help under the topic "Parameter identification with dichotomous variables."

## Fitting the Model

- Click the Bayesian Estimation button on the toolbar.
or
- From the menus, choose Analyze > Bayesian Estimation.

Note: The Calculate Estimates button is disabled because, with non-numeric data, you can perform only Bayesian estimation.

After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face. The Bayesian SEM window should then look something like this:

|  | Mean | S.E. | S.D. | C.S. | Median | Skewness | Kurtosis | Min | Max | Name |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |  |
| item1<--WILLING | 0.59 | 0.00 | 0.03 | 1.00 | 0.59 | 0.09 | -0.01 | 0.47 | 0.71 |  |
| item2<--WILLING | 0.61 | 0.00 | 0.03 | 1.00 | 0.61 | 0.11 | 0.02 | 0.48 | 0.74 |  |
| item3<--WILLING | 0.41 | 0.00 | 0.02 | 1.00 | 0.41 | 0.06 | 0.03 | 0.32 | 0.52 |  |
| item4<--AWARE | 0.56 | 0.00 | 0.03 | 1.00 | 0.56 | 0.11 | 0.03 | 0.43 | 0.70 |  |
| item5<--AWARE | 0.41 | 0.00 | 0.03 | 1.00 | 0.40 | 0.09 | -0.02 | 0.30 | 0.52 |  |
| item6<--AWARE | 0.55 | 0.00 | 0.03 | 1.00 | 0.55 | 0.08 | 0.02 | 0.43 | 0.68 |  |
| Intercepts |  |  |  |  |  |  |  |  |  |  |
| item1 | 0.62 | 0.00 | 0.02 | 1.00 | 0.62 | 0.02 | 0.04 | 0.52 | 0.72 |  |
| item2 | 0.35 | 0.00 | 0.03 | 1.00 | 0.35 | -0.01 | 0.01 | 0.25 | 0.45 |  |
| item3 | 0.52 | 0.00 | 0.02 | 1.00 | 0.52 | 0.00 | -0.01 | 0.43 | 0.61 |  |
| item6 | 0.62 | 0.00 | 0.02 | 1.00 | 0.62 | 0.02 | 0.08 | 0.53 | 0.72 |  |
| item4 | 0.35 | 0.00 | 0.03 | 1.00 | 0.35 | -0.07 | 0.10 | 0.23 | 0.47 |  |
| item5 | 0.48 | 0.00 | 0.02 | 1.00 | 0.48 | -0.02 | -0.03 | 0.39 | 0.57 |  |
| Covariances |  |  |  |  |  |  |  |  |  |  |
| AWARE<->WILLING | 0.55 | 0.00 | 0.04 | 1.00 | 0.56 | -0.11 | 0.04 | 0.39 | 0.69 |  |

(The figure above shows some, but not all, of the parameter estimates.) The Mean column provides a point estimate for each parameter. For example, the regression weight for using WILLING to predict item1 is estimated to be 0.59. The skewness (0.09) and kurtosis (-0.01) of the posterior distribution are close to 0, which is compatible with the posterior distribution being nearly normal. The standard deviation (S.D.) is 0.03, so there is about a 68% chance that the regression weight is within 0.03 of 0.59. Doubling the standard deviation gives 0.06, so there is about a 95% chance that the regression weight is within 0.06 of 0.59.
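The two probabilities quoted above follow from treating the posterior distribution as normal. The Python sketch below reproduces them under that assumption, using the reported mean and standard deviation. The helper `prob_within` is hypothetical.

```python
from statistics import NormalDist

# Normal approximation to the posterior of item1<--WILLING (an assumption)
posterior = NormalDist(mu=0.59, sigma=0.03)


def prob_within(dist, half_width):
    """Probability that a draw lies within half_width of the mean."""
    return dist.cdf(dist.mean + half_width) - dist.cdf(dist.mean - half_width)


print(round(prob_within(posterior, 0.03), 3))   # one S.D.: about 0.683
print(round(prob_within(posterior, 0.06), 3))   # two S.D.: about 0.954
```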

To view the posterior distribution of the regression weight:

- Right-click its row and choose Show Posterior from the pop-up menu.

|  | Mean | S.E. | S.D. | C.S. | Median | Skewness | Kurtosis | Min | Max | Name |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |  |  |
| item1<--WILLING | 0.59 | 0.00 | 0.03 | 1.00 | 0.59 | 0.09 | -0.01 | 0.47 | 0.71 |  |
| item2<--WILLING | 0.61 | 0.00 | 0.03 | 1.00 | 0.61 | 0.11 | 0.02 | 0.48 | 0.74 |  |
| item3<--WILLING | 0.41 | 0.00 | 0.02 | 1.00 | 0.41 | 0.06 | 0.03 | 0.32 | 0.52 |  |
| item4<--AWARE | 0.56 | 0.00 | 0.03 | 1.00 | 0.56 | 0.11 | 0.03 | 0.43 | 0.70 |  |

(The pop-up menu offers Show Posterior and Show Prior.)

The Posterior window displays the posterior distribution. The appearance of the distribution confirms what was concluded above from the mean, standard deviation, skewness, and kurtosis of the distribution. The shape of the distribution is nearly normal, and it looks like roughly 95% of the area lies between 0.53 and 0.65 (that is, within 0.06 of 0.59).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5b8093591a.jpg)

## MCMC Diagnostics

If you know how to interpret the diagnostic output from MCMC algorithms (for example, see Gelman et al., 2013), you might want to view the Trace plot and the Autocorrelation plot.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e0e6a7901b.jpg)

Posterior window for Group number 1, item1<--WILLING (plot options: Polygon, Histogram, Trace, Autocorrelation, Shaded, First and last)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3f1c624883.jpg)

The First and last plot provides another diagnostic. It shows two estimates of the posterior distribution (two superimposed plots), one estimate from the first third of the MCMC sample and another estimate from the last third of the MCMC sample.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c8947b4e83.jpg)
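The logic of comparing the first and last thirds of an MCMC sample can be sketched in Python. This sketch compares summary statistics rather than drawing the two superimposed density estimates, and it uses independent simulated draws as a stand-in for a well-mixed chain. The function name and the simulated values are hypothetical.

```python
import random
from statistics import mean, stdev


def first_and_last(samples):
    """(mean, S.D.) of the first third and of the last third of a sample."""
    third = len(samples) // 3
    first, last = samples[:third], samples[-third:]
    return (mean(first), stdev(first)), (mean(last), stdev(last))


random.seed(0)
# Simulated draws standing in for the MCMC sample of one parameter
chain = [random.gauss(0.59, 0.03) for _ in range(3000)]

(first_mean, first_sd), (last_mean, last_sd) = first_and_last(chain)
print(round(first_mean, 3), round(last_mean, 3))
print(round(first_sd, 3), round(last_sd, 3))
```

If the two thirds gave clearly different summaries, or visibly different plots, that would suggest that the sampling had not yet settled down.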

## Posterior Predictive Distributions

When you think of estimation, you normally think of estimating model parameters or some function of the model parameters such as a standardized regression weight or an indirect effect. However, there are other unknown quantities in the present analysis. Each entry in the data table shown earlier in this example represents a numeric value that is either unknown or partially known. For example, Person 1 did not respond to item2, so we can only guess at (estimate) that person's score on the underlying numeric variable. On the other hand, it seems like we ought to be able to make a fairly educated guess about the underlying numeric value, considering that we know how the person responded to the other items, and that we can also make use of the assumption that the model is correct.

We are in an even better position to guess at Person 1's score on the numeric variable that underlies item1 because Person 1 gave a response to item1. This person's response places his or her score in the middle interval, between the two boundaries. Since the two boundaries were arbitrarily fixed at 0 and 1, we know that the score is somewhere between 0 and 1, but it seems like we should be able to say more than that by using the person's responses on the other variables along with the assumption that the model is correct.

In Bayesian estimation, all unknown quantities are treated in the same way. Just as unknown parameter values are estimated by giving their posterior distribution, so are unknown data values. A posterior distribution for an unknown data value is called a posterior predictive distribution, but it is interpreted just like any posterior distribution. To view posterior predictive distributions for unknown data values:

- Click the Posterior Predictive button.
or
- From the menus, choose View > Posterior Predictive.

The Posterior Predictive Distributions window appears.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-9eb2e09562.jpg)

The Posterior Predictive Distributions window contains a table with a row for every person and a column for every observed variable in the model. An asterisk (*) indicates a missing value, while << indicates a response that places inequality constraints on the underlying numeric variable. To display the posterior distribution for an item:

- Click on the table entry in the upper-left corner (Person 1's response to item1).

The Posterior window opens, displaying the posterior distribution of Person 1's underlying numeric score. At first, the posterior distribution looks jagged and random.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-83cb302c5e.jpg)

That is because the program is building up an estimate of the posterior distribution as MCMC sampling proceeds. The longer you wait, the better the estimate of the posterior distribution will be. After a while, the estimate of the posterior distribution stabilizes and looks something like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-71c57ed86a.jpg)

The posterior distribution shows that Person 1's score on the numeric variable that underlies his or her response to item1 is between 0 and 1 (which we knew already), and that the score is more likely to be close to 1 than close to 0.

- Next, click the table entry in the first column of the 22nd row to estimate Person 22's score on the numeric variable that underlies his or her response to item1.

After you wait a while to get a good estimate of the posterior distribution, you see this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-61c946f34f.jpg)

Both Person 1 and Person 22 gave the agree response to item1, so both people have scores between 0 and 1 on the underlying numeric variable; however, their posterior distributions are very different.


For another example of a posterior predictive distribution, select a missing value like Person 1's response to item2. After allowing MCMC sampling to proceed long enough to get a good estimate of the posterior distribution, it looks like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-9f79a61230.jpg)

The mean of the posterior distribution (0.52) can be taken as an estimate of Person 1's score on the underlying variable if a point estimate is required. Looking at the plot of the posterior distribution, we can be nearly 100% sure that the score is between -1 and 2. The score is probably between 0 and 1 because most of the area under the posterior distribution lies between 0 and 1.

## Posterior Predictive Distributions for Latent Variables

Suppose you want to estimate Person 1's score on the WILLING factor. Amos can estimate posterior predictive distributions for unknown scores only for observed variables. It cannot estimate a posterior predictive distribution of a score on a latent variable. However, there is a trick that you can use to estimate the posterior predictive distribution of a score on WILLING. You can change WILLING to an observed variable, treating it not as a latent variable but as an observed variable that has a missing value for every case. That requires two changes - a change to the path diagram and a change to the data.

In the path diagram, the WILLING ellipse has to be changed into a rectangle. To accomplish this:

- Right-click the WILLING ellipse and choose Toggle Observed/Unobserved from the pop-up menu.
- Click the WILLING ellipse.

The WILLING ellipse changes to a rectangle so that the path diagram looks like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-35211d9cbd.jpg)

That takes care of the path diagram. It is also necessary to make a change to the data because if WILLING is an observed variable, then there has to be a WILLING column in the data file. You can directly modify the data file. Since this is a data file in SPSS Statistics format, you would use SPSS Statistics to add a WILLING variable to the data file, making sure that all the scores on WILLING are missing.

To avoid changing the original data file:

- Right-click the WILLING variable in the path diagram.
- Choose Data Recode from the pop-up menu to open the Data Recode window.

In the Data Recode window, click Create Variable. A new variable with the default name V1 appears in the New and recoded variables list box.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8fa8746979.jpg)

Data Recode window for Environment-nl-string.sav. The Original variables list contains item1 through item6. The window also has a New and recoded variables list, the Create Variable, Delete Variable, and Rename Variable buttons, and a Recoding rule area.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-def4e1d05c.jpg)

- Change V1 to WILLING. (If necessary, click the Rename Variable button.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-cdcfc0345e.jpg)
- You can optionally view the recoded dataset that includes the new WILLING variable by clicking the View Data button.

Original variables

|  | item1 | item2 | item3 | item4 | item5 | item6 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | A |  | SA | SD | A | A |
| 2 | A |  | A | SA | SA | SA |
| 3 |  | A | A | A | A | A |
| 4 | A | A | A |  |  |  |
| 5 | D | SD |  |  | D |  |
| 6 | SA | SA | A |  | A | A |
| 7 | A | D |  | A | A | A |
| 8 | D | D |  | SD |  | SD |
| 9 | SA | SA | SA | A |  | A |
| 10 | SA | A | A | SA | SA |  |

New and recoded variables

|  | item1 | item2 | item3 | item4 | item5 | item6 | WILLING |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 0<:<1 | * | >1 | <0 | 0<:<1 | 0<:<1 | * |
| 2 | 0<:<1 | * | 0<:<1 | >1 | >1 | >1 | * |
| 3 | * | 0<:<1 | 0<:<1 | 0<:<1 | 0<:<1 | 0<:<1 | * |
| 4 | 0<:<1 | 0<:<1 | 0<:<1 | * | * | * | * |
| 5 | <0 | <0 | * | * | <0 | * | * |
| 6 | >1 | >1 | 0<:<1 | * | 0<:<1 | 0<:<1 | * |
| 7 | 0<:<1 | <0 | * | 0<:<1 | 0<:<1 | 0<:<1 | * |
| 8 | <0 | <0 | * | <0 | * | <0 | * |
| 9 | >1 | >1 | >1 | 0<:<1 | * | 0<:<1 | * |
| 10 | >1 | 0<:<1 | 0<:<1 | >1 | >1 | * | * |


The first table shows the original dataset. The second table shows the recoded dataset as read by Amos. It includes item1 through item6 after recoding, and also the new WILLING variable.
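
The relationship between the two tables can be sketched in a few lines of Python. This is only an illustration: the recoding rule is inferred from the ten cases displayed (the rule actually in force is whatever was set up in the Data Recode window), the interval notation is copied from the display, and the names `RULE`, `MISSING`, and `recode_case` are hypothetical.

```python
# Recoding rule inferred from the displayed cases (an assumption).
RULE = {"SD": "<0", "D": "<0", "A": "0<:<1", "SA": ">1"}
MISSING = "*"  # how a missing value appears in the recoded display

def recode_case(case):
    """Recode one case (dict of item name -> response, or None if missing)
    and append a WILLING score, which is missing for every case."""
    recoded = {
        item: RULE[response] if response is not None else MISSING
        for item, response in case.items()
    }
    recoded["WILLING"] = MISSING
    return recoded

# Case 1 of the original dataset; item2 is missing.
case1 = {"item1": "A", "item2": None, "item3": "SA",
         "item4": "SD", "item5": "A", "item6": "A"}
print(recode_case(case1))
```

Because WILLING is missing for every case, nothing about its scores is supplied by the data; everything Amos reports about it comes from the model and the observed items.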

- Close the Data Recode window.
- Start the Bayesian analysis by clicking the Bayesian button on the Amos Graphics toolbar.
- In the Bayesian SEM window, wait until the unhappy face changes into a happy face, and then click the Posterior Predictive button.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-516660cb37.jpg)

- Click the entry in the upper-right corner of the table to display the posterior distribution of Person 1's score on the WILLING factor.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-cb72744b99.jpg)


## Imputation

Data imputation works the same way for ordered-categorical data as it does for numeric data. With ordered-categorical data, you can impute numeric values for missing values, for scores on latent variables, and for scores on the unobserved numeric variables that underlie observed ordered-categorical measurements.

You need a model in order to perform imputation. You could use the factor analysis model that was used earlier. There are two advantages and one disadvantage to using the factor analysis model for imputation. One advantage is that, if the model is correct, you can impute values for the factors. That is, you can create a new data set in which WILLING and AWARE are observed variables. The other advantage is that, if the factor analysis model is correct, it can be expected to give more accurate imputations for item1 through item6 than would be obtained from a less restrictive model. The disadvantage of using the factor analysis model is that it may be wrong. To be on the safe side, the present example will use the model that has the biggest chance of being correct, the saturated model shown in the following figure. (See the file Ex33-c.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4f37dc2d40.jpg)

After drawing the path diagram for the saturated model, you can begin the imputation.

- From the Amos Graphics menu, choose Analyze > Data Imputation.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-1378de33c6.jpg)

In the Amos Data Imputation window, notice that Regression imputation and Stochastic regression imputation are disabled. When you have non-numeric data, Bayesian imputation is the only choice.

We will accept the options shown in the preceding figure, creating 10 completed datasets and saving all 10 in a single SPSS Statistics data file called environment-nl-string_C.sav. To start the imputation:

- Click the Impute button.

The Bayesian SEM window opens along with the Data Imputation dialog box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b92b4f1653.jpg)

- Wait until the Data Imputation dialog box displays a happy face to indicate that each of the 10 completed data sets is effectively uncorrelated with the others.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-fe0ca15a3a.jpg)

Note: After you see a happy face but before you click OK, you may optionally right-click a parameter in the Bayesian SEM window and choose Show Posterior from the pop-up menu. This will allow you to examine the Trace and Autocorrelation plots.

- Click OK in the Data Imputation dialog box.

The Summary window shows a list of the completed data files that were created. In this case, only one completed data file was created.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-324acb0687.jpg)

- Double-click the file name in the Summary window to display the contents of the single completed data file, which contains 10 completed data sets.

The file contains 10,170 cases because each of the 10 completed datasets contains 1,017 cases. The first 1,017 rows of the new data file contain the first completed dataset. The Imputation_ variable is equal to 1 for each row in the first completed dataset, and the CaseNo variable runs from 1 through 1,017 before starting over again at 1.
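
The layout of the stacked file can be expressed as simple arithmetic. The following Python sketch assumes that the completed datasets are stored one after another in the order just described; the function name `locate` is hypothetical.

```python
N_CASES = 1017        # cases in the original data file
N_IMPUTATIONS = 10    # completed datasets stacked in the single output file

def locate(row):
    """Map a 1-based row number in the stacked file to the pair
    (Imputation_, CaseNo), assuming datasets are stacked in order."""
    if not 1 <= row <= N_CASES * N_IMPUTATIONS:
        raise ValueError("row is outside the stacked file")
    imputation, case_index = divmod(row - 1, N_CASES)
    return imputation + 1, case_index + 1

print(N_CASES * N_IMPUTATIONS)            # total number of rows
print(locate(1), locate(1017), locate(1018), locate(10170))
```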

|  | item1 | item2 | item3 | item4 | item5 | item6 | CaseNo | Imputation_ |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 0.82 | 0.57 | 1.01 | -0.45 | 0.78 | 0.10 | 1.00 | 1.00 |
| 2 | 0.64 | -0.25 | 0.30 | 1.22 | 1.56 | 1.91 | 2.00 | 1.00 |
| 3 | 1.32 | 0.61 | 0.53 | 0.35 | 0.17 | 0.74 | 3.00 | 1.00 |
| 4 | 0.00 | 0.39 | 0.79 | 1.50 | 0.84 | 1.73 | 4.00 | 1.00 |
| 5 | -0.32 | -0.69 | -0.46 | -0.90 | -0.47 | 0.13 | 5.00 | 1.00 |
| 6 | 1.63 | 1.26 | 0.61 | 0.73 | 0.74 | 0.44 | 6.00 | 1.00 |
| 7 | 0.75 | -0.13 | 0.61 | 0.25 | 0.41 | 0.78 | 7.00 | 1.00 |
| 8 | -0.98 | -0.09 | 0.13 | -0.63 | 0.52 | -0.12 | 8.00 | 1.00 |
| 9 | 2.69 | 2.45 | 1.22 | 0.34 | 0.99 | 0.95 | 9.00 | 1.00 |
| 10 | 1.35 | 0.10 | 0.78 | 1.55 | 1.03 | 1.29 | 10.00 | 1.00 |
| 11 | 0.18 | 0.37 | 0.78 | 0.24 | 0.53 | 0.95 | 11.00 | 1.00 |
| 12 | 1.34 | 1.05 | 0.29 | 0.05 | 0.53 | 0.82 | 12.00 | 1.00 |

Normally, the next step would be to use the 10 completed datasets in environment-nl-string_C.sav as input to some other program that requires numeric (not ordered-categorical) data. You would use that other program to perform 10 separate analyses using each one of the 10 completed data sets in turn. Then, you would do further computations to combine the results of those 10 separate analyses into a single set of results, as was done in Example 31. Those steps will not be carried out here.
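
As a reminder of what the combining step involves, here is a minimal Python sketch of one common set of combining formulas, Rubin's rules. Whether this matches the computation in Example 31 in every detail is an assumption, the function name `pool` is hypothetical, and the three estimates and squared standard errors are made-up numbers, not results from these data.

```python
import statistics

def pool(estimates, variances):
    """Combine m point estimates and their squared standard errors
    using Rubin's rules. Returns (pooled estimate, pooled standard error)."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between     # total variance
    return q_bar, total ** 0.5

# Made-up results from three separate analyses, for illustration only.
estimate, std_err = pool([1.0, 1.2, 1.1], [0.04, 0.05, 0.045])
print(f"pooled estimate {estimate:.3f}, pooled standard error {std_err:.3f}")
```

The between-imputation term is what makes the pooled standard error reflect the uncertainty caused by the missing data, which a single completed dataset would understate.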

## Mixture Modeling with Training Data

## Introduction

Mixture modeling is appropriate when you have a model that is incorrect for an entire population, but where the population can be divided into subgroups in such a way that the model is correct in each subgroup.

Mixture modeling is discussed in the context of structural equation modeling by Arminger, Stein, and Wittenberg (1999), Hoshino (2001), Lee (2007, Chapter 11), Loken (2004), Vermunt and Magidson (2005), and Zhu and Lee (2001), among others.

The present example demonstrates mixture modeling for the situation in which some cases have already been assigned to groups while other cases have not. It is up to Amos to learn from the cases that are already classified and to classify the others.

We begin mixture modeling with an example in which some cases have already been classified because setting up such an analysis is almost identical to setting up an ordinary multiple-group analysis such as in Examples 10, 11, and 12.

It is possible to perform mixture modeling when no cases have been classified in advance so that the program must classify every case. Example 35 demonstrates this type of analysis.

## About the Data

The data for this example were collected by Anderson (1935) and used by Fisher (1936) to demonstrate discriminant analysis. The original data are in the file iris.sav, of which a portion is shown here:

|  | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 49 | 5.3 | 3.7 | 1.5 | .2 | setosa |
| 50 | 5.0 | 3.3 | 1.4 | .2 | setosa |
| 51 | 7.0 | 3.2 | 4.7 | 1.4 | versicolor |
| 52 | 6.4 | 3.2 | 4.5 | 1.5 | versicolor |
| 53 | 6.9 | 3.1 | 4.9 | 1.5 | versicolor |
| 54 | 5.5 | 2.3 | 4.0 | 1.3 | versicolor |
| 55 | 6.5 | 2.8 | 4.6 | 1.5 | versicolor |
| 56 | 5.7 | 2.8 | 4.5 | 1.3 | versicolor |
| 57 | 6.3 | 3.3 | 4.7 | 1.6 | versicolor |
| 58 | 4.9 | 2.4 | 3.3 | 1.0 | versicolor |
| 59 | 6.6 | 2.9 | 4.6 | 1.3 | versicolor |
| 60 | 5.2 | 2.7 | 3.9 | 1.4 | versicolor |
| 61 | 5.0 | 2.0 | 3.5 | 1.0 | versicolor |
| 62 | 5.9 | 3.0 | 4.2 | 1.5 | versicolor |

The dataset contains four measurements on flowers from 150 different plants. The first 50 flowers were irises of the species setosa. The next 50 were irises of the species versicolor. The last 50 were of the species virginica.

A scatterplot of two of the numeric measurements, PetalLength and PetalWidth, suggests that those two measurements alone will be useful in classifying the flowers according to species.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-abb1c4f907.jpg)

The setosa flowers are all by themselves in the lower left corner of the scatterplot. It should therefore be easy for Amos to use PetalLength and PetalWidth to distinguish the setosa flowers from the others. On the other hand, there is some overlap of versicolor and virginica, so we should expect that sometimes it will be hard to tell whether a flower is versicolor or virginica purely on the basis of PetalLength and PetalWidth.

Example 34

This example will not use the iris.sav dataset, which gives the species of every flower. Instead, the example will use the iris3.sav dataset, which gives the species for only a few flowers. The following figure shows a portion of the iris3.sav dataset.

|  | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 5.1 | 3.5 | 1.4 | .2 | setosa |
| 2 | 4.9 | 3.0 | 1.4 | .2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | .2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | .2 | setosa |
| 5 | 5.0 | 3.6 | 1.4 | .2 | setosa |
| 6 | 5.4 | 3.9 | 1.7 | .4 | setosa |
| 7 | 4.6 | 3.4 | 1.4 | .3 | setosa |
| 8 | 5.0 | 3.4 | 1.5 | .2 | setosa |
| 9 | 4.4 | 2.9 | 1.4 | .2 | setosa |
| 10 | 4.9 | 3.1 | 1.5 | .1 | setosa |
| 11 | 5.4 | 3.7 | 1.5 | .2 |  |
| 12 | 4.8 | 3.4 | 1.6 | .2 |  |
| 13 | 4.8 | 3.0 | 1.4 | .1 |  |
| 14 | 4.3 | 3.0 | 1.1 | .1 |  |

Species information is available for 10 of the setosa flowers, 10 of the versicolor flowers, and 10 of the virginica flowers. Species is unknown for the remaining 120 flowers. When Amos analyzes these data, it will have 10 examples of each kind of flower to assist in classifying the rest of the flowers.
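
The structure of the partially labeled dataset can be sketched as follows. The sketch assumes that the labeled flowers are the first 10 in each block of 50. That is what the figure shows for setosa, but it is only a guess for versicolor and virginica. The name `known_species` is hypothetical.

```python
from collections import Counter

SPECIES = ["setosa", "versicolor", "virginica"]
BLOCK = 50        # flowers per species, in file order
N_LABELED = 10    # labeled flowers per species

def known_species(case_no):
    """Species recorded for a 1-based case number, or None if unknown.
    Assumes the labeled flowers are the first 10 of each block of 50."""
    block, offset = divmod(case_no - 1, BLOCK)
    return SPECIES[block] if offset < N_LABELED else None

labels = [known_species(n) for n in range(1, 151)]
counts = Counter(labels)
print(counts)  # 10 per species; the rest (key None) are left for Amos to classify
```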

## Performing the Analysis

- From the menus, choose File > New to start a new path diagram.
- From the menus, choose Analyze > Manage Groups.
- In the Manage Groups dialog, change the name in the Group Name text box from Group number 1 to PossiblySetosa.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a1e9cce174.jpg)
- Click New to create a second group.
- Change the name in the Group Name text box from Group number 2 to PossiblyVersicolor.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3721be46dd.jpg)

- Click New to create a third group.
- Change the name in the Group Name text box from Group number 3 to PossiblyVirginica.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6c5c798fc5.jpg)
- Click Close.


## Specifying the Data File

- From the menus, choose File > Data Files.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-88455f4893.jpg)

- Click PossiblySetosa to select that row.
- Click File Name, select the iris3.sav file that is in the Amos Examples directory, and click Open.
- Click Grouping Variable and double-click Species in the Choose a Grouping Variable dialog. This tells the program that the Species variable will be used for classifying flowers.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0f4098aef3.jpg)

- In the Data Files dialog, click Group Value and then double-click setosa in the Choose Value for Group dialog.

The Choose Value for Group dialog (Group: PossiblySetosa, File: iris3.sav, Variable: Species, Cases: 150) lists the following values:

| Value | Freq |
| :--- | ---: |
| setosa | 10 |
| versicolor | 10 |
| virginica | 10 |
| Cluster1 | 0 |

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-91490fa792.jpg)

The Data Files dialog should now look like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a4f84791ad.jpg)

- Repeat the preceding steps for the PossiblyVersicolor group, but this time double-click versicolor in the Choose Value for Group dialog.
- Repeat the preceding steps once more for the PossiblyVirginica group, but this time double-click virginica in the Choose Value for Group dialog. The Data Files dialog will end up looking like this:

| Group Name | File | Variable | Value | N |
| :--- | :--- | :--- | :--- | :--- |
| PossiblySetosa | iris3.sav | Species | setosa | 10/150 |
| PossiblyVersicolor | iris3.sav | Species | versicolor | 10/150 |
| PossiblyVirginica | iris3.sav | Species | virginica | 10/150 |

So far, the analysis has been set up exactly like an ordinary three-group analysis in which the species of every flower is known. The next step is unique to mixture modeling.

- Select Assign cases to groups (a check mark will appear next to it). The check mark tells Amos to assign a case to a group whenever the dataset does not specify which group that case belongs to.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-fa9d36c88e.jpg)
- Click OK to close the Data Files dialog.


## Specifying the Model

We will use a saturated model for the variables PetalLength and PetalWidth. The scatterplot that was shown earlier suggests that these two variables will allow the program to do a good job of classifying the flowers according to species.

Note that you are not limited to saturated models when doing mixture modeling. You can use a factor analysis model or a regression model or any other kind of model. See Example 36 for a demonstration of mixture modeling with a regression model.

- Draw the following path diagram. (This path diagram is saved as Ex34-a.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-06ba10c981.jpg)
- From the menus, choose View > Analysis Properties.
- Select Estimate means and intercepts (a check mark will appear next to it).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6535e82e75.jpg)


## Fitting the Model

- Click the Bayesian button on the toolbar.
or
- From the menus, choose Analyze > Bayesian Estimation.

Note: The button for non-Bayesian estimation is disabled because, in mixture modeling, you can perform only Bayesian estimation.

After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face. The table of estimates in the Bayesian SEM window should look something like this:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f93084b7ed.jpg)

Bayesian SEM window, PossiblySetosa tab:

|  | Mean | S.E. | S.D. | C.S. | Skewness | Kurtosis | Min | Max |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Means |  |  |  |  |  |  |  |  |
| PetalLength | 1.462 | 0.000 | 0.026 | 1.000 | -0.009 | 0.112 | 1.356 | 1.585 |
| PetalWidth | 0.246 | 0.000 | 0.016 | 1.000 | -0.008 | 0.137 | 0.184 | 0.310 |
| Covariances |  |  |  |  |  |  |  |  |
| PetalWidth<->PetalLength | 0.007 | 0.000 | 0.003 | 1.000 | 0.552 | 0.868 | -0.005 | 0.028 |
| Variances |  |  |  |  |  |  |  |  |
| PetalWidth | 0.013 | 0.000 | 0.003 | 1.000 | 1.011 | 2.133 | 0.006 | 0.034 |
| PetalLength | 0.034 | 0.000 | 0.008 | 1.000 | 0.894 | 1.414 | 0.016 | 0.081 |
| PossiblySetosa Proportion |  |  |  |  |  |  |  |  |
| Proportion | 0.333 | 0.000 | 0.037 | 1.000 | 0.121 | 0.003 | 0.192 | 0.492 |

The Bayesian SEM window displays all of the parameter estimates that you would get in an ordinary three-group analysis. The table displays the results for one group at a time. You can switch from one group to another by clicking the tabs at the top of the table. In this example, the model parameters include only means, variances, and covariances. In a more complicated model, there would also be estimates of regression weights and intercepts.

In a mixture modeling analysis, you also get an estimate of the proportion of the population that lies in each group. The preceding figure shows that the proportion of setosa flowers in the population is estimated to be 0.333. (It should be pointed out that it was by design that the sample contained equal numbers of setosa, versicolor, and virginica flowers. It is therefore not meaningful in this example to draw inferences about population proportions from the sample. Nevertheless, we will treat species here as a random variable in order to demonstrate how such inferences can be made.)

To view the posterior distribution of a population proportion, right-click the row that contains the proportion and choose Show Posterior from the pop-up menu.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-1a74e2154e.jpg)

The Posterior window shows that the proportion of flowers that belong to the setosa species is almost certainly between 0.25 and 0.45. It looks like there is about a 50-50 chance that the proportion is somewhere between 0.3 and 0.35.

PossiblySetosa
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6b3485c5c4.jpg)

## Classifying Individual Cases

To obtain probabilities of group membership for each individual flower:

- Click the Posterior Predictive button.
or
- From the menus, choose View > Posterior Predictive.

|  | PetalLength | PetalWidth | P(setosa) | P(versicolor) | P(virginica) |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 47 | 1.6 | 0.2 | $\underline{1.00}$ | 0.00 | 0.00 |
| 48 | 1.4 | 0.2 | $\underline{1.00}$ | 0.00 | 0.00 |
| 49 | 1.5 | 0.2 | $\underline{1.00}$ | 0.00 | 0.00 |
| 50 | 1.4 | 0.2 | $\underline{1.00}$ | 0.00 | 0.00 |
| 51 | 4.7 | 1.4 | 0.00 | $\underline{0.95}$ | 0.05 |
| 52 | 4.5 | 1.5 | 0.00 | $\underline{0.94}$ | 0.06 |
| 53 | 4.9 | 1.5 | 0.00 | $\underline{0.87}$ | 0.13 |
| 54 | 4 | 1.3 | 0.00 | $\underline{0.99}$ | 0.01 |
| 55 | 4.6 | 1.5 | 0.00 | $\underline{0.93}$ | 0.07 |
| 56 | 4.5 | 1.3 | 0.00 | $\underline{0.98}$ | 0.02 |
| 57 | 4.7 | 1.6 | 0.00 | $\underline{0.82}$ | 0.18 |
| 58 | 3.3 | 1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 59 | 4.6 | 1.3 | 0.00 | $\underline{0.96}$ | 0.04 |
| 60 | 3.9 | 1.4 | 0.00 | $\underline{0.97}$ | 0.03 |
| 61 | 3.5 | 1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 62 | 4.2 | 1.5 | 0.00 | $\underline{0.93}$ | 0.07 |
| 63 | 4 | 1 | 0.00 | $\underline{0.99}$ | 0.01 |
| 64 | 4.7 | 1.4 | 0.00 | $\underline{0.95}$ | 0.05 |
| 65 | 3.6 | 1.3 | 0.00 | $\underline{0.98}$ | 0.02 |
| 66 | 4.4 | 1.4 | 0.00 | $\underline{0.98}$ | 0.02 |
| 67 | 4.5 | 1.5 | 0.00 | $\underline{0.94}$ | 0.06 |
| 68 | 4.1 | 1 | 0.00 | $\underline{0.98}$ | 0.02 |
| 69 | 4.5 | 1.5 | 0.00 | $\underline{0.94}$ | 0.06 |
| 70 | 3.9 | 1.1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 71 | 4.8 | 1.8 | 0.00 | 0.26 | $\underline{0.74}$ |
| 72 | 4 | 1.3 | 0.00 | $\underline{0.99}$ | 0.01 |

For each flower, the Posterior Predictive Distributions window shows the probability that that flower is setosa, versicolor, or virginica.

For the first 50 flowers (the ones that actually are setosa), the probability of membership in the setosa group is nearly 1. We expected that result because the setosa flowers were clearly separated from flowers of other species in the scatterplot shown earlier.

Most of the versicolor flowers (starting with case number 51) were also correctly classified. For example, flower number 51 has posterior probability 0.95 of being versicolor. However, classification errors do occur. Case number 71, for example, is misclassified. It is a versicolor flower, but it is estimated to have a 0.74 probability of being virginica.
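
The idea behind these membership probabilities can be sketched with Bayes' rule: the probability of a group is proportional to the group's population proportion times the bivariate normal density of the flower's measurements in that group. The Python sketch below is a plug-in approximation only. Amos bases its probabilities on the whole posterior distribution rather than on a single set of estimates, so its numbers will differ. The setosa values are the posterior means from the PossiblySetosa tab shown earlier; the versicolor and virginica values are placeholders chosen for illustration, not Amos output. All function names are hypothetical.

```python
import math

def bvn_density(x, y, mean, var, cov):
    """Bivariate normal density at (x, y). `mean` and `var` are
    (PetalLength, PetalWidth) pairs; `cov` is their covariance."""
    dx, dy = x - mean[0], y - mean[1]
    det = var[0] * var[1] - cov ** 2
    q = (dx * dx * var[1] - 2 * dx * dy * cov + dy * dy * var[0]) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

# group -> (proportion, means, variances, covariance)
GROUPS = {
    # Posterior means reported on the PossiblySetosa tab.
    "setosa": (0.333, (1.462, 0.246), (0.034, 0.013), 0.007),
    # Placeholder values for illustration only (assumptions).
    "versicolor": (0.333, (4.3, 1.3), (0.22, 0.04), 0.07),
    "virginica": (0.334, (5.6, 2.0), (0.30, 0.08), 0.05),
}

def membership(petal_length, petal_width):
    """Plug-in probabilities of group membership for one flower."""
    weighted = {
        group: prop * bvn_density(petal_length, petal_width, mean, var, cov)
        for group, (prop, mean, var, cov) in GROUPS.items()
    }
    total = sum(weighted.values())
    return {group: w / total for group, w in weighted.items()}

print(membership(1.4, 0.2))   # a flower with setosa-like measurements
print(membership(4.8, 1.8))   # a flower in the region where two groups overlap
```

A flower far from every other group's mean gets a probability near 1 for its own group, while a flower in the overlap region splits its probability between the neighboring groups.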

## Latent Structure Analysis

It was mentioned earlier that you are not limited to saturated models when doing mixture modeling. You can use a factor analysis model, a regression model, or any model at all. You may want to become familiar with an important variation of the saturated model. Latent structure analysis (Lazarsfeld and Henry, 1968) is a variation of mixture modeling in which the measured variables are required to be independent within each group. When the measured variables are multivariate normal, they are required to be uncorrelated.

- To require that the measured variables be uncorrelated, delete the double-headed arrow in the path diagram of the saturated model. (This path diagram is saved as Ex34-b.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c695426a62.jpg)
- Click the Bayesian button to perform the latent structure analysis. The results of the latent structure analysis will not be presented here.
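
The effect of deleting the double-headed arrow can be checked numerically: with the covariance fixed at zero, the bivariate normal density within a group is the product of two univariate normal densities. The following Python sketch illustrates this. The means and variances are taken from the PossiblySetosa tab of the saturated model purely for illustration, and the function names are hypothetical.

```python
import math

def normal_density(x, mean, var):
    """Univariate normal density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def bvn_density(x, y, mean, var, cov):
    """Bivariate normal density with means `mean`, variances `var`,
    and covariance `cov`."""
    dx, dy = x - mean[0], y - mean[1]
    det = var[0] * var[1] - cov ** 2
    q = (dx * dx * var[1] - 2 * dx * dy * cov + dy * dy * var[0]) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

mean, var = (1.462, 0.246), (0.034, 0.013)   # illustrative values
x, y = 1.4, 0.2

joint = bvn_density(x, y, mean, var, cov=0.0)
product = normal_density(x, mean[0], var[0]) * normal_density(y, mean[1], var[1])
print(math.isclose(joint, product))   # the density factorizes when cov is 0
```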


## Example 35

## Mixture Modeling without Training Data

## Introduction

Mixture modeling is appropriate when you have a model that is incorrect for an entire population, but where the population can be divided into subgroups in such a way that the model is correct in each subgroup.

When Amos performs mixture modeling, it allows you to assign some cases to groups before the analysis starts. Example 34 shows how to do that. In the present example, all cases are unclassified at the start of the mixture modeling analysis.

## About the Data

This example uses the Anderson (1935) iris data that was used in Example 34. This time, however, we will not use the iris3.sav dataset, which contains species information for 30 of the 150 flowers. Instead, we will use the iris2.sav dataset, which contains no species information at all. That is the difference between Example 34 and the present example: In Example 34, some cases were pre-classified; in the present example, no cases are pre-classified. The following figure shows a portion of the iris2.sav dataset.

|  | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 5.1 | 3.5 | 1.4 | .2 |  |
| 2 | 4.9 | 3.0 | 1.4 | .2 |  |
| 3 | 4.7 | 3.2 | 1.3 | .2 |  |
| 4 | 4.6 | 3.1 | 1.5 | .2 |  |
| 5 | 5.0 | 3.6 | 1.4 | .2 |  |
| 6 | 5.4 | 3.9 | 1.7 | .4 |  |
| 7 | 4.6 | 3.4 | 1.4 | .3 |  |
| 8 | 5.0 | 3.4 | 1.5 | .2 |  |
| 9 | 4.4 | 2.9 | 1.4 | .2 |  |
| 10 | 4.9 | 3.1 | 1.5 | .1 |  |
| 11 | 5.4 | 3.7 | 1.5 | .2 |  |
| 12 | 4.8 | 3.4 | 1.6 | .2 |  |
| 13 | 4.8 | 3.0 | 1.4 | .1 |  |
| 14 | 4.3 | 3.0 | 1.1 | .1 |  |

Notice that the dataset contains a Species column, even though that column is empty. It is important that the Species column be present even if it contains no values. This is because Amos allows for the possibility that you might already know the species of some cases (as in Example 34). The variable that is used for classifying cases does not actually have to be named Species. Any variable name will do. The variable does, however, have to be a string (non-numeric) variable.

## Performing the Analysis

- From the menus, choose File > New to start a new path diagram.
- From the menus, choose Analyze > Manage Groups.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ba4221f9d6.jpg)
- Click New to create a second group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0ed6ab64d9.jpg)
- Click New once more to create a third group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a204b73e4c.jpg)
- Click Close.

This example fits a three-group mixture model. When you aren't sure how many groups there are, you can run the program multiple times. Run the program once to fit a two-group model, then again to fit a three-group model, and so on.

## Specifying the Data File

- From the menus, choose File > Data Files.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d769ea0e6a.jpg)
- Click Group number 1 to select the first row.
- Click File Name, select the iris2.sav file that is in the Amos Examples directory, and click Open.
- Click Grouping Variable and double-click Species in the Choose a Grouping Variable dialog. This tells the program that the Species variable will be used to distinguish one group from another.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-afcb37cf09.jpg)
- Repeat the preceding steps for Group number 2, specifying the same data file (iris2.sav) and the same grouping variable (Species).

- Repeat the preceding steps once more for Group number 3, specifying the same data file (iris2.sav) and the same grouping variable (Species).

| Group Name | File | Variable | Value | N |
| :--- | :--- | :--- | :--- | :--- |
| Group number 1 | iris2.sav | Species |  | 150/150 |
| Group number 2 | iris2.sav | Species |  | 150/150 |
| Group number 3 | iris2.sav | Species |  | 150/150 |

- Select Assign cases to groups (a check mark will appear next to it).

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-0a386122eb.jpg)

In the Data Files dialog, both Allow non-numeric data and Assign cases to groups are now selected.

So far, this has been just like any ordinary multiple-group analysis except for the check mark next to Assign cases to groups. That check mark turns this into a mixture modeling analysis. The check mark tells Amos to assign a flower to a group if the grouping variable in the data file does not already assign it to a group. Notice that it was not necessary to click Group Value to specify a value for the grouping variable. The data file contains no values for the grouping variable (Species), so the program automatically constructed the following Species values for the three groups: Cluster1, Cluster2, and Cluster3.

- Click OK to close the Data Files dialog.


## Specifying the Model

We will use a saturated model for the variables PetalLength and PetalWidth. The scatterplot in Example 34 suggests that these two variables will allow the program to do a good job of classifying the flowers according to species.

Note that you are not limited to saturated models when doing mixture modeling. You can use a factor analysis model, a regression model, or any other kind of model. Example 36 demonstrates mixture modeling with a regression model.

- Draw the following path diagram:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4dea027e05.jpg)
- From the menus, choose View > Analysis Properties.
- Select Estimate means and intercepts (a check mark will appear next to it).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6fe8430f8b.jpg)


## Constraining the Parameters

In this example, variances and covariances will be required to be invariant across groups. This is the assumption of homogeneity of variances and covariances that is often made in discriminant analysis and some kinds of clustering. In principle, the assumption of homogeneity of variances and covariances is not necessary in mixture modeling. The reason we will make the assumption here is that, for this example, the algorithm in Amos fails without that assumption. (It should be noted that the scatterplot in Example 34 suggests that the assumption is violated.)

- Right-click PetalLength in the path diagram, choose Object Properties from the pop-up menu, and enter the parameter name, v1, in the Variance text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-819419ccdd.jpg)

- While the Object Properties dialog is still open, click PetalWidth in the path diagram.

- In the Object Properties dialog, enter the parameter name, v2, in the Variance text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-996f4b71cd.jpg)

- While the Object Properties dialog is still open, click the double-headed arrow that represents the covariance between PetalLength and PetalWidth.
- In the Object Properties dialog, enter the parameter name, c12, in the Covariance text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f48408c163.jpg)

The path diagram should now look like the following figure. (This path diagram is saved as Ex35-a.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a687030f3f.jpg)
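
Because the same names (v1, v2, and c12) apply in all three groups, each named parameter is a single parameter shared across groups. The bookkeeping can be sketched in Python as follows. The sketch only counts parameters; it does not call Amos. The labels given to the means (`m1_1` and so on) are hypothetical and serve only to make each mean distinct, and the group proportions are not counted.

```python
GROUPS = ["Group number 1", "Group number 2", "Group number 3"]

def parameter_labels(constrained):
    """Return {group: {parameter: label}}. Parameters that carry the same
    label in different groups count as one shared parameter."""
    labels = {}
    for i, group in enumerate(GROUPS, start=1):
        suffix = "" if constrained else f"_{i}"
        labels[group] = {
            "mean PetalLength": f"m1_{i}",      # hypothetical label, free in each group
            "mean PetalWidth": f"m2_{i}",       # hypothetical label, free in each group
            "variance PetalLength": "v1" + suffix,
            "variance PetalWidth": "v2" + suffix,
            "covariance": "c12" + suffix,
        }
    return labels

def count_distinct(labels):
    """Number of distinct parameters across all groups."""
    return len({label for group in labels.values() for label in group.values()})

print(count_distinct(parameter_labels(constrained=False)))  # separate (co)variances per group
print(count_distinct(parameter_labels(constrained=True)))   # v1, v2, c12 shared by all groups
```

With the constraints in place, only the means are free to differ from one group to another, which is what the homogeneity assumption requires.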

## Fitting the Model

- Click the Bayesian button on the toolbar.
or
- From the menus, choose Analyze > Bayesian Estimation.

Note: The button for non-Bayesian estimation is disabled because, in mixture modeling, you can perform only Bayesian estimation.

After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face. The table of estimates in the Bayesian SEM window should then look something like this:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a7df4e6192.jpg)

The Bayesian SEM window displays all of the parameter estimates that you would get in an ordinary three-group analysis. The table displays the estimates for one group at a time. You can switch from one group to another by clicking the tabs at the top of the table. In this example, the model parameters include only means, variances, and covariances. In a more complicated model, there would also be estimates of regression weights and intercepts.

In a mixture modeling analysis, you also get an estimate of the proportion of the population that lies in each group. In the preceding figure, the proportion of the population in Group number 1 is estimated to be 0.306.

To view the posterior distribution of a population proportion, right-click the row that contains the proportion and choose Show Posterior from the pop-up menu.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-2faa7e9e22.jpg)

The graph of the posterior distribution in the Posterior window shows that the proportion of flowers that belong in Group number 1 is certainly between 0.15 and 0.45. There is a very high probability that the proportion is between 0.25 and 0.35.

Group number 1
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8170811220.jpg)

## Classifying Individual Cases

To obtain probabilities of group membership for each individual flower:

- Click the Posterior Predictive button.
or
- From the menus, choose View > Posterior Predictive.

For each flower, the Posterior Predictive Distributions window shows the probability that the value of the Species variable is Cluster1, Cluster2, or Cluster3.

Posterior Predictive Distributions

|  | PetalLength | PetalWidth | P(Cluster1) | P(Cluster2) | P(Cluster3) |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 46 | 1.4 | 0.3 | 0.00 | 0.00 | $\underline{1.00}$ |
| 47 | 1.6 | 0.2 | 0.00 | 0.00 | $\underline{1.00}$ |
| 48 | 1.4 | 0.2 | 0.00 | 0.00 | $\underline{1.00}$ |
| 49 | 1.5 | 0.2 | 0.00 | 0.00 | $\underline{1.00}$ |
| 50 | 1.4 | 0.2 | 0.00 | 0.00 | $\underline{1.00}$ |
| 51 | 4.7 | 1.4 | 0.01 | $\underline{0.99}$ | 0.00 |
| 52 | 4.5 | 1.5 | 0.01 | $\underline{0.99}$ | 0.00 |
| 53 | 4.9 | 1.5 | 0.04 | $\underline{0.96}$ | 0.00 |
| 54 | 4 | 1.3 | 0.00 | $\underline{1.00}$ | 0.00 |
| 55 | 4.6 | 1.5 | 0.02 | $\underline{0.98}$ | 0.00 |
| 56 | 4.5 | 1.3 | 0.00 | $\underline{1.00}$ | 0.00 |
| 57 | 4.7 | 1.6 | 0.09 | $\underline{0.91}$ | 0.00 |
| 58 | 3.3 | 1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 59 | 4.6 | 1.3 | 0.00 | $\underline{1.00}$ | 0.00 |
| 60 | 3.9 | 1.4 | 0.00 | $\underline{1.00}$ | 0.00 |
| 61 | 3.5 | 1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 62 | 4.2 | 1.5 | 0.01 | $\underline{0.99}$ | 0.00 |
| 63 | 4 | 1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 64 | 4.7 | 1.4 | 0.01 | $\underline{0.99}$ | 0.00 |
| 65 | 3.6 | 1.3 | 0.00 | $\underline{1.00}$ | 0.00 |
| 66 | 4.4 | 1.4 | 0.00 | $\underline{1.00}$ | 0.00 |
| 67 | 4.5 | 1.5 | 0.01 | $\underline{0.99}$ | 0.00 |
| 68 | 4.1 | 1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 69 | 4.5 | 1.5 | 0.01 | $\underline{0.99}$ | 0.00 |
| 70 | 3.9 | 1.1 | 0.00 | $\underline{1.00}$ | 0.00 |
| 71 | 4.8 | 1.8 | $\underline{0.69}$ | 0.31 | 0.00 |
| 72 | 4 | 1.3 | 0.00 | $\underline{1.00}$ | 0.00 |
| 73 | 4.9 | 1.5 | 0.04 | $\underline{0.96}$ | 0.00 |
| 74 | 4.7 | 1.2 | 0.00 | $\underline{1.00}$ | 0.00 |

The first 50 cases, which we know to be examples of setosa, are placed in Group number 3 with a probability of 1, so Group number 3 clearly contains setosa flowers. Cases 51 through 100 fall mainly into Group number 2, so Group number 2 clearly contains versicolor flowers. Similarly, although the preceding figure does not show it, cases 101 through 150 are assigned mainly to Group number 1, so Group number 1 clearly contains virginica flowers.
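The group membership probabilities in the table can be understood through Bayes' rule: the probability that a case belongs to a group is proportional to the group's population proportion times the group's bivariate normal density at that case. The following Python sketch illustrates the calculation with a single set of plug-in parameter values. Every number in `groups` is a hypothetical placeholder (not an Amos estimate), the function names are invented for this illustration, and the variances and covariance are shared across groups to mirror the constraints in the model above. Amos reports posterior predictive probabilities based on its MCMC samples, so its values need not match a plug-in calculation like this one.

```python
import math

def bivariate_normal_pdf(x, y, mean, var, cov):
    """Bivariate normal density with means (mx, my), variances (vx, vy), covariance cov."""
    mx, my = mean
    vx, vy = var
    det = vx * vy - cov * cov          # determinant of the covariance matrix
    dx, dy = x - mx, y - my
    quad = (vy * dx * dx - 2.0 * cov * dx * dy + vx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def membership_probabilities(x, y, groups):
    """Bayes' rule: proportion * density, normalized to sum to 1 over groups."""
    weighted = [
        g["proportion"] * bivariate_normal_pdf(x, y, g["mean"], g["var"], g["cov"])
        for g in groups
    ]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical plug-in values for (PetalLength, PetalWidth); not Amos output.
shared_var, shared_cov = (0.2, 0.04), 0.05
groups = [
    {"proportion": 0.3, "mean": (5.5, 2.0), "var": shared_var, "cov": shared_cov},
    {"proportion": 0.4, "mean": (4.3, 1.3), "var": shared_var, "cov": shared_cov},
    {"proportion": 0.3, "mean": (1.5, 0.25), "var": shared_var, "cov": shared_cov},
]

# A flower with short, narrow petals is assigned almost entirely to the third group.
probs = membership_probabilities(1.4, 0.2, groups)
print([round(p, 2) for p in probs])
```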

## Latent Structure Analysis

There is a variation of mixture modeling called latent structure analysis in which observed variables are required to be independent within each group.

- To require that PetalLength and PetalWidth be uncorrelated and therefore (because they are multivariate normally distributed) independent, remove the double-headed arrow that connects them in the path diagram. The resulting path diagram is shown here. (This path diagram is saved as the file, Ex35-b.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a0808b83bb.jpg)
- Optionally, remove the constraints on the variances by deleting the parameter names, v1 and v2. (The resulting path diagram is saved as Ex35-c.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b5c8c86064.jpg)
- After deleting the double-headed arrow and possibly removing the constraints on the variances, click the Bayesian button to perform the latent structure analysis. The results of the latent structure analysis will not be reported here.
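The independence requirement can be written compactly. The notation here is an assumption introduced for illustration and does not appear in the Amos output: $\pi_k$ is the population proportion of group $k$, $\phi$ is the univariate normal density, and $\mu_{kj}$ and $\sigma_{kj}^2$ are the mean and variance of variable $j$ (PetalLength or PetalWidth) in group $k$. With no covariance within groups, the density of a case is a mixture of products of univariate densities:

$$
f(x_1, x_2) = \sum_{k=1}^{3} \pi_k \, \phi\!\left(x_1; \mu_{k1}, \sigma_{k1}^2\right) \phi\!\left(x_2; \mu_{k2}, \sigma_{k2}^2\right), \qquad \sum_{k=1}^{3} \pi_k = 1 .
$$

If the parameter names v1 and v2 are kept, the variances are additionally constrained to be equal across groups, so that $\sigma_{kj}^2 = \sigma_j^2$ for every $k$.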


## Label Switching

If you attempt to replicate the analysis in this example, it is possible that you will get the results that are reported here but with the group names permuted. The results reported here for Group number 1 might correspond to the results you get for Group number 2 or Group number 3. This is sometimes called label switching (Chung, Loken, and Schafer, 2004). Label switching is not really a problem unless it occurs during the course of a single analysis. Unfortunately, label switching can in fact occur in the middle of an analysis. When label switching occurs, it is usually revealed by trace plots for individual parameters. To display a trace plot during Bayesian estimation:

- Right-click a parameter in the Bayesian SEM window and choose Show Posterior from the pop-up menu.
- In the Posterior window, select Trace.

Label switching did not occur in the analysis of the present example. The following figure, from another analysis, shows a trace plot that is typical of label switching. This trace plot came from an analysis of data with two clusters of cases. In one cluster, the mean of a variable called $X$ was about 4. In the other cluster, the mean of the $X$ variable was about 17. The trace plot shows that, in the group called Group number 1, the sampled values of the mean of $X$ stayed close to 4 most of the time until about the 5,000th iteration of the MCMC algorithm. At about the 5,000th iteration, sampled values started fluctuating around 17. This abrupt shift in the trace plot is evidence that the group labels (Group number 1 and Group number 2) were switched at about the 5,000th iteration. The trace plot shows that this label switching occurred several times during the first 20,000 iterations of the MCMC algorithm.

Group number 1
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-51dd8346fd.jpg)

Label switching can also be revealed by a multimodal posterior distribution for one or more parameters. The preceding trace plot corresponds to the following posterior distribution estimate.

Group number 1
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7db7b33660.jpg)

The preceding graph shows that the mean of a parameter's posterior distribution may not be a meaningful estimate in a mixture modeling analysis when label switching occurs. Some methods for preventing label switching have been proposed (Celeux, Hurn, and Robert, 2000; Frühwirth-Schnatter, 2004; Jasra, Holmes, and Stephens, 2005; Stephens, 2000). Chung, Loken, and Schafer (2004) suggest that pre-assigning even one or two cases to groups can be effective in eliminating label switching. Amos allows pre-assigning cases to groups, as shown in Example 34. Amos does not implement any other method for preventing label switching.
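To see why the posterior mean stops being meaningful, consider a simplified, simulated trace. The Python sketch below is only an illustration: it is not an Amos feature, and the single switch at iteration 5,000, the standard deviation of 0.5, the block size, and the jump threshold are all assumptions chosen for the demonstration. Only the two cluster means (about 4 and about 17) come from the text above.

```python
import random
import statistics

random.seed(0)

# Simulated trace: samples near 4 before the label switch, near 17 after it.
trace = ([random.gauss(4.0, 0.5) for _ in range(5000)]
         + [random.gauss(17.0, 0.5) for _ in range(5000)])

# The overall mean falls between the two modes, where no samples actually lie.
posterior_mean = statistics.fmean(trace)
near_mean = sum(1 for value in trace if abs(value - posterior_mean) < 2.0)

def block_means(values, block_size):
    """Mean of each consecutive block of sampled values."""
    return [statistics.fmean(values[i:i + block_size])
            for i in range(0, len(values), block_size)]

# A crude check: flag blocks whose mean jumps sharply from the previous block.
means = block_means(trace, 1000)
jumps = [i for i in range(1, len(means)) if abs(means[i] - means[i - 1]) > 5.0]

print(round(posterior_mean, 1), near_mean, jumps)
```

In this simulation the mean of the sampled values lands roughly midway between the two modes, a value that the sampler essentially never visits, and the block-mean check flags the block in which the switch occurs.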

## Mixture Regression Modeling

## Introduction

Mixture regression modeling (Ding, 2006) is appropriate when you have a regression model that is incorrect for an entire population, but where the population can be divided into subgroups in such a way that the regression model is correct in each subgroup.
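For a single predictor, the idea can be sketched in symbols as follows. The notation is an assumption made for this illustration: $g_i$ is the unobserved group of case $i$, $\pi_k$ is the population proportion of group $k$, and $\alpha_k$, $\beta_k$, and $\sigma_k^2$ are the intercept, regression weight, and error variance in group $k$.

$$
y_i \mid (x_i,\; g_i = k) \;\sim\; N\!\left(\alpha_k + \beta_k x_i,\; \sigma_k^2\right), \qquad P(g_i = k) = \pi_k, \qquad \sum_{k} \pi_k = 1 .
$$

A single regression line for the whole population corresponds to the special case in which $\alpha_k$, $\beta_k$, and $\sigma_k^2$ do not depend on $k$.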

## About the Data

Two artificial datasets will be used to explain mixture regression.

## First Dataset

The following dataset is in the file DosageAndPerformance1.sav. Dosage is the intensity of some treatment. Performance is just some performance measure. Group is a string (non-numeric) variable whose role in mixture regression analysis will be explained later.

Example 36

|  | dosage | performance | group |
| :--- | :--- | :--- | :--- |
| 1 | 5.26 | -.35 |  |
| 2 | 2.10 | 4.86 |  |
| 3 | 1.93 | 5.20 |  |
| 4 | 1.21 | 3.50 |  |
| 5 | 1.25 | 2.61 |  |
| 6 | -.86 | 6.64 |  |
| 7 | 3.85 | 2.31 |  |
| 8 | 2.51 | 2.60 |  |
| 9 | 1.29 | 4.06 |  |
| 10 | .90 | 6.12 |  |
| 11 | 2.11 | 4.23 |  |
| 12 | .42 | 6.68 |  |
| 13 | 2.53 | 3.48 |  |
| 14 | .88 | 6.51 |  |

A scatterplot of dosage and performance shows two distinct groups of people in the sample. In one group, performance improves as dosage goes up. In the other group, performance gets worse as dosage goes up.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7d4227216e.jpg)

It would be a mistake to try to fit a single regression line to the whole sample. On the other hand, two straight lines, one for each group, would fit the data well. This is a job for mixture regression modeling. A mixture regression analysis would attempt to divide the sample up into groups and to fit a separate regression line to each group.
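The following Python sketch illustrates why a single regression line is misleading for data like these. It uses a tiny, noise-free dataset invented for the illustration (not the values in DosageAndPerformance1.sav), and it assumes that group membership is known, which is exactly what a mixture regression analysis does not assume. The function name `ols` is hypothetical.

```python
def ols(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

dosage = [0.0, 1.0, 2.0, 3.0, 4.0]
improving = [1.0 + 1.0 * x for x in dosage]   # performance rises with dosage
worsening = [7.0 - 1.0 * x for x in dosage]   # performance falls with dosage

# One line for the whole sample: the opposite trends cancel out.
pooled_slope, pooled_intercept = ols(dosage + dosage, improving + worsening)

# One line per group: each trend is recovered.
slope_up, intercept_up = ols(dosage, improving)
slope_down, intercept_down = ols(dosage, worsening)

print(pooled_slope, slope_up, slope_down)
```

The pooled line is flat even though dosage is strongly related to performance within each group.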

## Second Dataset

The following dataset, in the file DosageAndPerformance2.sav, also contains data on the variables dosage, performance, and group.

|  | dosage | performance | group |
| :--- | :--- | :--- | :--- |
| 1 | 6.66 | 21.20 |  |
| 2 | 5.66 | 15.70 |  |
| 3 | 6.06 | 19.20 |  |
| 4 | 9.19 | 23.13 |  |
| 5 | 6.94 | 20.99 |  |
| 6 | 5.16 | 18.04 |  |
| 7 | 4.18 | 13.96 |  |
| 8 | 8.08 | 22.52 |  |
| 9 | 4.68 | 14.06 |  |
| 10 | 7.86 | 19.55 |  |
| 11 | 2.76 | 12.97 |  |
| 12 | 3.70 | 11.96 |  |
| 13 | 5.54 | 15.00 |  |
| 14 | 7.06 | 20.14 |  |

Again, a scatterplot of the data shows evidence of two groups, with each group requiring its own regression line. In either group by itself, an increase of one unit in dosage is associated with an increase of about two units in performance, so that the slope of the regression line is about 2 within each group. On the other hand, the two groups have different intercepts. At any particular dosage level, performance is 5 points or so higher in one group than in the other. A mixture regression analysis of this dataset would attempt to divide the sample up into groups and to fit a separate regression line to each group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-59a6ffa52a.jpg)

## The Group Variable in the Dataset

Both of the datasets just described include a string (non-numeric) variable called group that contains no data. In a mixture regression analysis, Amos will use the group variable to classify individual cases. (The fact that the variable is called group is not important. Any variable name will do; however, it does have to be a string variable.)

If some cases have already been assigned to groups before the analysis starts, you can put the group names in the group column of the dataset. For example, if you know ahead of time (before the mixture regression analysis starts) that the sample contains high performers and low performers and you know that the first two people in the sample are high performers and that the next three people are low performers, then you can enter that information in the group column of the data table in the following way:

|  | dosage | performance | group |
| :--- | :--- | :--- | :--- |
| 1 | 6.66 | 21.20 | high |
| 2 | 5.66 | 15.70 | high |
| 3 | 6.06 | 19.20 | low |
| 4 | 9.19 | 23.13 | low |
| 5 | 6.94 | 20.99 | low |
| 6 | 5.16 | 18.04 |  |
| 7 | 4.18 | 13.96 |  |
| 8 | 8.08 | 22.52 |  |
| 9 | 4.68 | 14.06 |  |
| 10 | 7.86 | 19.55 |  |
| 11 | 2.76 | 12.97 |  |
| 12 | 3.70 | 11.96 |  |
| 13 | 5.54 | 15.00 |  |
| 14 | 7.06 | 20.14 |  |

The program will then use the five cases that have been pre-classified to assist in classifying the remaining cases. Pre-assigning selected individual cases to groups is mentioned here only as a possibility. In the present example, no cases will be preassigned to groups.

## Performing the Analysis

Only the DosageAndPerformance2.sav dataset will be analyzed in this example.

- From the menus, choose File > New to start a new path diagram.
- From the menus, choose Analyze > Manage Groups.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8b22f0d6ca.jpg)

- Click New to create a second group.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a238a31621.jpg)

- Click Close.

This example fits a two-group mixture regression model. When you aren't sure how many groups there are, you can run the program multiple times. Run the program once to fit a two-group model, then again to fit a three-group model, and so on.

## Specifying the Data File

- From the menus, choose File > Data Files.

| Group Name | File | Variable | Value | N |
| :--- | :--- | :--- | :--- | :--- |
| Group number 1 | \<working\> |  |  |  |
| Group number 2 | \<working\> |  |  |  |

| File Name | Working File | Help |
| :--- | :--- | :--- |
| View Data | Grouping Variable | Group Value |
| OK |  | Cancel |
| □ Allow non-numeric data | □ Assign cases to groups |  |

- Click Group number 1 to select that row.
- Click File Name, select the DosageAndPerformance2.sav file that is in the Amos Examples directory, and click Open.
- Click Grouping Variable and double-click group in the Choose a Grouping Variable dialog. This tells the program that the variable called group will be used to distinguish one group from another.


Choose a Grouping Variable dialog (Group: Group number 1, File: dosageandperformance2.sav):
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e66f24673a.jpg)

- Repeat the preceding steps for Group number 2, specifying the same data file (DosageAndPerformance2.sav) and the same grouping variable (group).


## Data Files

| Group Name | File | Variable | Value | N |
| :--- | :--- | :--- | :--- | :--- |
| Group number 1 | DosageAndPerformance2.sav | group |  | 400/400 |
| Group number 2 | DosageAndPerformance2.sav | group |  | 400/400 |


| File Name | Working File | Help |
| :--- | :--- | :--- |
| View Data | Grouping Variable | Group Value |
| OK |  | Cancel |
| □ Allow non-numeric data | □ Assign cases to groups |  |

- Select Assign cases to groups (a check mark will appear next to it).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-257155c4d4.jpg)

So far, this has been just like any ordinary multiple-group analysis except for the check mark next to Assign cases to groups. That check mark turns this into a mixture modeling analysis. The check mark tells Amos to assign a case to a group if the grouping variable in the data file does not already assign it to a group. Notice that it was not necessary to click Group Value to specify a value for the grouping variable. The data file contains no values for the grouping variable (group), so the program automatically constructed values for the group variable: Cluster1 for cases in Group number 1, and Cluster2 for cases in Group number 2.

- Click OK to close the Data Files dialog.


## Specifying the Model

- Draw a path diagram for the regression model, as follows. (This path diagram is saved as Ex36-a.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-47ddef30d3.jpg)
- From the menus, choose View > Analysis Properties.
- Select Estimate means and intercepts (a check mark will appear next to it).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c676800054.jpg)


## Fitting the Model

- Click the Bayesian button on the toolbar.
or
- From the menus, choose Analyze > Bayesian Estimation.

Note: The button for non-Bayesian estimation is disabled because, in mixture modeling, you can perform only Bayesian estimation.

After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face. The table of estimates in the Bayesian SEM window should then look something like this:

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-fed1536f78.jpg)

The Bayesian SEM window contains all of the parameter estimates that you would get in an ordinary multiple-group regression analysis. There is a separate table of estimates for each group. You can switch from group to group by clicking the tabs just above the table of estimates.

The bottom row of the table contains an estimate of the proportion of the population that lies in an individual group. The preceding figure, which displays estimates for Group number 1, shows that the proportion of the population in Group number 1 is estimated to be 0.247. To see the estimated posterior distribution of that population proportion, right-click the proportion's row in the table and choose Show Posterior from the pop-up menu.

Bayesian SEM window, Group number 1 tab:

|  | Mean | S.E. | S.D. | C.S. | Skewness | Kurtosis | Min | Max |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |
| performance<--dosage | 2.082 | 0.001 | 0.070 | 1.000 | -0.051 | 0.087 | 1.772 | 2.339 |
| Means |  |  |  |  |  |  |  |  |
| dosage | 7.111 | 0.001 | 0.181 | 1.000 | -0.027 | 0.091 | 6.353 | 7.824 |
| Intercepts |  |  |  |  |  |  |  |  |
| performance | 5.399 | 0.005 | 0.528 | 1.000 | 0.067 | 0.110 | 3.402 | 7.975 |
| Variances |  |  |  |  |  |  |  |  |
| dosage | 2.905 | 0.003 | 0.458 | 1.000 | 0.626 | 0.762 | 1.679 | 6.024 |
| E1 | 1.026 | 0.001 | 0.174 | 1.000 | 0.671 | 0.920 | 0.544 | 2.184 |
| Group number 1 Proportion |  |  |  |  |  |  |  |  |
| Proportion | 0.247 | 0.000 | 0.022 | 1.000 | 0.159 | -0.009 | 0.168 | 0.345 |

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d3561e68b5.jpg)

The graph in the Posterior window shows that the proportion of the population in Group number 1 is practically guaranteed to be somewhere between 0.15 and 0.35.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f49e2682b3.jpg)

Let's compare the regression weight and the intercept in Group number 1 with the corresponding estimates in Group number 2. In Group number 1, the regression weight estimate is 2.082 and the intercept estimate is 5.399. In Group number 2, the regression weight estimate (1.999) is about the same as in Group number 1 while the intercept estimate (9.955) is substantially higher than in Group number 1.

Bayesian SEM window, Group number 2 tab:

|  | Mean | S.E. | S.D. | C.S. | Skewness | Kurtosis | Min | Max |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Regression weights |  |  |  |  |  |  |  |  |
| performance<--dosage | 1.999 | 0.000 | 0.033 | 1.000 | -0.027 | 0.069 | 1.864 | 2.139 |
| Means |  |  |  |  |  |  |  |  |
| dosage | 2.132 | 0.001 | 0.110 | 1.000 | 0.007 | 0.086 | 1.603 | 2.632 |
| Intercepts |  |  |  |  |  |  |  |  |
| performance | 9.955 | 0.001 | 0.089 | 1.000 | 0.006 | 0.056 | 9.564 | 10.311 |
| Variances |  |  |  |  |  |  |  |  |
| dosage | 3.468 | 0.002 | 0.293 | 1.000 | 0.323 | 0.129 | 2.462 | 4.959 |
| E1 | 1.026 | 0.001 | 0.090 | 1.000 | 0.334 | 0.191 | 0.708 | 1.432 |
| Group number 2 Proportion |  |  |  |  |  |  |  |  |
| Proportion | 0.753 | 0.000 | 0.022 | 1.000 | -0.159 | -0.009 | 0.655 | 0.832 |

## Classifying Individual Cases

To obtain probabilities of group membership for each individual case:

- Click the Posterior Predictive button.
or
- From the menus, choose View > Posterior Predictive.

Posterior Predictive Distributions

|  | performance | dosage | P(Cluster1) | P(Cluster2) |
| :--- | :--- | :--- | :--- | :--- |
| 1 | 21.2031284997218 | 6.66331007602931 | $\underline{0.88}$ | 0.12 |
| 2 | 15.6987674948943 | 5.66102565586862 | $\underline{1.00}$ | 0.00 |
| 3 | 19.2048361186385 | 6.05921131399695 | $\underline{0.98}$ | 0.02 |
| 4 | 23.1294140957387 | 9.18741643404798 | $\underline{1.00}$ | 0.00 |
| 5 | 20.9890517035103 | 6.94140644349163 | $\underline{1.00}$ | 0.00 |
| 6 | 18.0430561625521 | 5.16054293700744 | $\underline{0.55}$ | 0.45 |
| 7 | 13.9604306100494 | 4.17650111795259 | $\underline{1.00}$ | 0.00 |
| 8 | 22.5151975204992 | 8.07913550523847 | $\underline{1.00}$ | 0.00 |
| 9 | 14.0616719804701 | 4.67521627594292 | $\underline{1.00}$ | 0.00 |
| 10 | 19.546086381681 | 7.85787684574796 | $\underline{1.00}$ | 0.00 |
| 11 | 12.9679285391251 | 2.7627255307244 | 0.07 | $\underline{0.93}$ |
| 12 | 11.958726583912 | 3.6995932072631 | $\underline{1.00}$ | 0.00 |
| 13 | 15.0037101134325 | 5.53950501114771 | $\underline{1.00}$ | 0.00 |
| 14 | 20.1350290402946 | 7.06075641893955 | $\underline{1.00}$ | 0.00 |

For each case, the Posterior Predictive Distributions window shows the probability that the group variable takes on the value Cluster1 or Cluster2. Case 1 is estimated to have a 0.88 probability of being in Group number 1 and a 0.12 probability of being in Group number 2. Recall that the first group has an intercept of about 5.399 while the second group has an intercept of about 9.955, so Group number 1 is the low performing group. Therefore, there is an 88 percent chance that the first person in the sample is in the low performing group and a 12 percent chance that that person is in the high performing group.
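As a rough cross-check, the classification of case 1 can be approximated by plugging the posterior means from the two Bayesian SEM tables into Bayes' rule. The Python sketch below is an approximation under stated assumptions, not Amos's procedure: it treats the posterior means as if they were the true parameter values, it models dosage as normally distributed within each group, and the function names are invented for this illustration. Amos's posterior predictive probabilities are based on the MCMC samples, so the two calculations should agree only approximately.

```python
import math

def normal_logpdf(value, mean, variance):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + (value - mean) ** 2 / variance)

# Posterior means copied from the Bayesian SEM tables for the two groups.
groups = {
    "Cluster1": {"proportion": 0.247, "slope": 2.082, "intercept": 5.399,
                 "error_var": 1.026, "dosage_mean": 7.111, "dosage_var": 2.905},
    "Cluster2": {"proportion": 0.753, "slope": 1.999, "intercept": 9.955,
                 "error_var": 1.026, "dosage_mean": 2.132, "dosage_var": 3.468},
}

def plug_in_membership(dosage, performance):
    """Plug-in Bayes' rule: proportion * density of dosage * density of performance given dosage."""
    log_weights = {}
    for name, g in groups.items():
        predicted = g["intercept"] + g["slope"] * dosage
        log_weights[name] = (math.log(g["proportion"])
                             + normal_logpdf(dosage, g["dosage_mean"], g["dosage_var"])
                             + normal_logpdf(performance, predicted, g["error_var"]))
    top = max(log_weights.values())            # subtract the maximum for numerical stability
    weights = {name: math.exp(lw - top) for name, lw in log_weights.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Case 1 in the dataset: dosage 6.66, performance 21.20.
probs = plug_in_membership(6.66, 21.20)
print({name: round(p, 2) for name, p in probs.items()})
```

For case 1, the plug-in probability of Cluster1 comes out close to the 0.88 that Amos reports.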

## Improving Parameter Estimates

You can improve the parameter estimates (and also improve Amos's ability to form clusters) by reducing the number of parameters that need to be estimated. As we have seen, the slope of the regression line is about the same for the two groups. Also, the variability about each regression line appears to be about the same for the two groups. It is possible to incorporate into the mixture modeling analysis the hypothesis that the slopes and the error variances are the same for the two groups, thereby reducing the number of distinct parameters to be estimated. To do this:

- On the path diagram, right-click the single-headed arrow that connects dosage and performance, choose Object Properties from the pop-up menu, and enter the parameter name, b, in the Regression weight text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-13567e590d.jpg)
- While the Object Properties dialog is still open, click E1 in the path diagram.
- In the Object Properties dialog, enter the parameter name, v, in the Variance text box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ecfa3987fb.jpg)

The path diagram should now look like the following figure. (This path diagram is saved as Ex36-b.amw.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-379c960792.jpg)

After constraining the slope and error variance to be the same for the two groups, you can repeat the mixture modeling analysis by clicking the Bayesian button. The results of that analysis will not be presented here.

## Prior Distribution of Group Proportions

For the prior distribution of group proportions, Amos uses a Dirichlet distribution with parameters that you can specify. By default, the Dirichlet parameters are $4,4, \ldots$.

To specify the Dirichlet parameters, right-click on a group proportion's estimate in the Bayesian SEM window and choose Show Prior from the pop-up menu.

![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c13f480071.jpg)
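To get a feel for what the default prior implies, recall a standard property of the Dirichlet distribution: with parameters $a_1, \ldots, a_K$ summing to $a_0$, each individual proportion has a Beta distribution with parameters $a_k$ and $a_0 - a_k$. The Python sketch below uses that property to compute the prior mean and standard deviation of each group proportion. The function name is hypothetical, and the calculation describes the prior only; it says nothing about how Amos uses the prior internally.

```python
import math

def dirichlet_marginal_summary(alphas):
    """Prior mean and standard deviation of each proportion under a Dirichlet prior."""
    total = sum(alphas)
    summary = []
    for a in alphas:
        mean = a / total
        variance = a * (total - a) / (total ** 2 * (total + 1))
        summary.append((mean, math.sqrt(variance)))
    return summary

# Default parameters for a two-group and a three-group analysis.
two_group = dirichlet_marginal_summary([4, 4])
three_group = dirichlet_marginal_summary([4, 4, 4])

print(two_group[0], three_group[0])
```

With equal parameters, the prior treats the groups symmetrically: each proportion has a prior mean of one divided by the number of groups.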

## Label Switching

It is possible that the results reported here for Group number 1 will match the results that you get for Group number 2, and that the results reported here for Group number 2 will match those that you get for Group number 1. In other words, your results may match the results reported here, but with the group names reversed. This is sometimes called label switching (Chung, Loken, and Schafer, 2004). Label switching is discussed further at the end of Example 35.

## Example 37

## Using Amos Graphics without Drawing a Path Diagram

## Introduction

People usually specify models in Amos Graphics by drawing path diagrams; however, Amos Graphics also provides a non-graphical method for model specification. If you don't want to draw a path diagram, you can specify a model by entering text in the form of a Visual Basic or C\# program. In such a program, each object in a path diagram (for example, each rectangle, ellipse, single-headed arrow, double-headed arrow, and figure caption) corresponds to a single program statement. Usually, a program statement is one line of text.

Here are some reasons why you might choose to specify a model by entering text rather than by drawing a path diagram.

- Your model is so big that drawing its path diagram would be difficult.
- You prefer using a keyboard to using a mouse, or prefer working with text to working with graphics.
- You need to generate a lot of similar models that differ only in some detail such as the number of variables or the variable names. If you need to generate such models frequently, it can be efficient to automate the chore by creating a super program whose text output is a tailor-made Visual Basic or C\# program that specifies the particular model that you want Amos to fit.

The present example shows how to specify a model in Amos Graphics by entering text rather than by drawing a path diagram.

## About the Data

The Holzinger and Swineford (1939) dataset from Example 8 is used for this example.

## A Common Factor Model

The factor analysis model from Example 8 is used for this example. Whereas the model was specified in Example 8 by drawing its path diagram, the same model will be specified in the current example by writing a Visual Basic program.

## Creating a Plugin to Specify the Model

- From the menus, choose Plugins > Plugins.
- In the Plugins dialog, click Create.

Plugins dialog:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ce8aa0c13a.jpg)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-bc31895ac5.jpg)

The Program Editor window opens.


```
<System.ComponentModel.Composition.Export(GetType(Amos.IPlugin))>
Public Class CustomCode
Implements Amos.IPlugin
Public Function Name() As String Implements Amos.IPlugin.Name
    Return ""
End Function
Public Function Description() As String Implements Amos.IPlugin.Description
    Return ""
End Function
Public Function Mainsub() As Integer Implements Amos.IPlugin.Mainsub
End Function
End Class
```

In the Program Editor window, change the Name and Description functions so that they return meaningful strings.


```
<System.ComponentModel.Composition.Export(GetType(Amos.IPlugin))>
Public Class CustomCode
Implements Amos.IPlugin
Public Function Name() As String Implements Amos.IPlugin.Name
    Return "Example 37a"
End Function
Public Function Description() As String Implements Amos.IPlugin.Description
    Return "Example 37 from the Amos User's Guide"
End Function
Public Function Mainsub() As Integer Implements Amos.IPlugin.Mainsub
End Function
End Class
```

You may find it helpful at this point to refer to the first path diagram in Example 8. We are going to add one line to the Mainsub function for each rectangle, ellipse and arrow in the path diagram.

- In the Program Editor, enter the following line as the first line in the Mainsub function:

```
pd.Observed("visperc")
```
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6533ab6f9f.jpg)

If you save the plugin now, you can use it later on to draw a rectangle representing a variable called visperc. The rectangle will be drawn with arbitrary height and width at a random location in the path diagram. You can also specify its height, width and location. For example,

```
pd.Observed("visperc", 400, 300, 200, 100)
```

draws a rectangle for a variable called visperc. The rectangle will be centered 400 logical pixels from the left edge of the path diagram and 300 logical pixels from the top edge. It will be 200 logical pixels wide and 100 logical pixels high. (A logical pixel is $1/96$ of an inch.) The online help gives other variations of the Observed method.

In this example, we will not specify the height, width or location of any path diagram objects.

- Enter the following additional lines in the Mainsub function so that the plugin will draw five more rectangles for the five remaining observed variables:

```
pd.Observed("cubes")
pd.Observed("lozenges")
pd.Observed("paragrap")
pd.Observed("sentence")
pd.Observed("wordmean")
```

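The Program Editor window should now contain the following:

```
<System.ComponentModel.Composition.Export(GetType(Amos.IPlugin))>
Public Class CustomCode
Implements Amos.IPlugin
Public Function Name() As String Implements Amos.IPlugin.Name
    Return "Example 37a"
End Function
Public Function Description() As String Implements Amos.IPlugin.Description
    Return "Example 37 from the Amos User's Guide"
End Function
Public Function Mainsub() As Integer Implements Amos.IPlugin.Mainsub
    pd.Observed("visperc")
    pd.Observed("cubes")
    pd.Observed("lozenges")
    pd.Observed("paragrap")
    pd.Observed("sentence")
    pd.Observed("wordmean")
End Function
End Class
```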

- Enter the following lines so that the plugin will draw eight ellipses for the eight unobserved variables:

```
pd.Unobserved("err_v")
pd.Unobserved("err_c")
pd.Unobserved("err_l")
pd.Unobserved("err_p")
pd.Unobserved("err_s")
pd.Unobserved("err_w")
pd.Unobserved("spatial")
pd.Unobserved("verbal")
```

- Enter the following lines so that the plugin will draw the 12 single-headed arrows:

```
pd.Path("visperc", "spatial", 1)
pd.Path("cubes", "spatial")
pd.Path("lozenges", "spatial")
pd.Path("paragrap", "verbal", 1)
pd.Path("sentence", "verbal")
pd.Path("wordmean", "verbal")
pd.Path("visperc", "err_v", 1)
pd.Path("cubes", "err_c", 1)
pd.Path("lozenges", "err_l", 1)
pd.Path("paragrap", "err_p", 1)
pd.Path("sentence", "err_s", 1)
pd.Path("wordmean", "err_w", 1)
```

Notice that in some of the lines above, the Path method has a third argument that is set equal to 1. This is how you fix a regression weight to a constant value of 1. See the online help for other variations of the Path method.

- Enter the following line so that the plugin will draw the double-headed arrow:

```
pd.Cov("spatial", "verbal")
```

- Enter the following line to reposition the objects in the path diagram so as to improve its appearance:

```
pd.Reposition()
```

As mentioned above, the simple forms of the Observed, Unobserved and Caption methods that are used in this example place objects at random positions in the path diagram. The Reposition method attempts to make the path diagram look better by rearranging objects. The Reposition method does not produce path diagrams of presentation quality. Far from it, in fact. On the other hand, Reposition usually improves a path diagram's appearance substantially. In order to get objects in the path diagram sized and positioned exactly the way you want, you can use one of the following approaches.

- Specify a height, width and location each time you use the Observed, Unobserved and Caption methods of the pd class. (See the online help for the Observed, Unobserved and Caption methods.)
or
- In your plugin, use the Reposition method to improve the positioning of objects. After running the plugin, use the drawing tools in the Amos Graphics toolbox to interactively move and resize the objects in the path diagram.


## Controlling Undo Capability

- Enter the following line as the first line in the Mainsub function:

```
pd.UndoToHere
```

- Enter the following line as the last line in the Mainsub function:

```
pd.UndoResume
```

The UndoToHere method and the UndoResume method work together to ensure that the effect of running the plugin can be undone by one click of the Undo button.

The Mainsub function now looks like this in the Program Editor:

```
Public Function Mainsub() As Integer Implements Amos.IPlugin.Mainsub
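    ' UndoToHere and UndoResume bracket the plugin's changes so that
    ' one click of the Undo button reverses all of them.
    pd.UndoToHere
    ' Observed variables.
    pd.Observed("visperc")
    pd.Observed("cubes")
    pd.Observed("lozenges")
    pd.Observed("paragrap")
    pd.Observed("sentence")
    pd.Observed("wordmean")
    ' Unobserved error variables, one for each observed variable.
    pd.Unobserved("err_v")
    pd.Unobserved("err_c")
    pd.Unobserved("err_l")
    pd.Unobserved("err_p")
    pd.Unobserved("err_s")
    pd.Unobserved("err_w")
    ' Unobserved common factors.
    pd.Unobserved("spatial")
    pd.Unobserved("verbal")
    ' Paths from the factors; a third argument of 1 fixes that regression weight to 1.
    pd.Path("visperc", "spatial", 1)
    pd.Path("cubes", "spatial")
    pd.Path("lozenges", "spatial")
    pd.Path("paragrap", "verbal", 1)
    pd.Path("sentence", "verbal")
    pd.Path("wordmean", "verbal")
    ' Paths from the error variables, each with its regression weight fixed to 1.
    pd.Path("visperc", "err_v", 1)
    pd.Path("cubes", "err_c", 1)
    pd.Path("lozenges", "err_l", 1)
    pd.Path("paragrap", "err_p", 1)
    pd.Path("sentence", "err_s", 1)
    pd.Path("wordmean", "err_w", 1)
    ' Double-headed arrow (covariance) between the two factors.
    pd.Cov("spatial", "verbal")
    ' Rearrange the randomly placed objects to improve the diagram's appearance.
    pd.Reposition()
    pd.UndoResume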
End Function
```

This completes the plugin for specifying the factor analysis model from Example 8. Amos comes with a pre-written copy of the plugin in a file called Ex37a-plugin.vb. Language-specific versions of this file are saved in the folders %amosplugins%\Japanese and %amosplugins%\English. You can use one of the pre-written language-specific plugins by copying it to the %amosplugins% folder.

## Compiling and Saving the Plugin

- Click the Check Syntax button on the toolbar in the Program Editor window. Any compilation errors will be displayed on the Syntax errors tab of the Program Editor window.
- After you fix any compilation errors, click Close in the Program Editor window. You will be asked if you want to save the file:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-7d8429471e.jpg)
- Click Yes. The Save As dialog will be displayed.
- In the Save As dialog, type a filename for your plugin and click Save. Your plugin must be saved in the Save As dialog's default folder location. If you have inadvertently changed the folder in the Save As dialog, you can change it back to the default by entering %amosplugins% as the folder name.

After you have saved your plugin, its name, Example 37a, appears on the list of plugins in the Plugins window. (Recall that Example 37a is the string returned by the plugin's Name function.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b6d6daa09d.jpg)

- Close the Plugins window.


## Using the Plugin

- From the menus, choose File > New to start with an empty path diagram.

If you are asked whether you want to save your work, choose either Yes or No.

- From the menus, choose Plugins > Example 37a. The plugin generates the model's path diagram, which is then displayed in the path diagram window. The following path diagram was generated during the preparation of this example. (You will almost certainly get a different path diagram because a random number generator plays a role in positioning the elements in the path diagram.)
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-41cd9ae266.jpg)


## Other Aspects of the Analysis in Addition to Model Specification

In Example 8, the data file Grnt_fem.sav was specified interactively (by choosing File > Data Files on the menus). You can do the same thing here as well. As an alternative, you can specify the Grnt_fem.sav data file within the plugin by adding the following lines to the Mainsub function:

```
pd.SetDataFile(1, MiscAmosTypes.cDatabaseFormat.mmSPSS,
Environment.GetEnvironmentVariable("examples") & "\grnt_fem.sav", "", "",
"")
```

Similarly, in Example 8, standardized estimates were requested interactively (by choosing View > Analysis Properties on the menus). As an alternative to requesting standardized estimates interactively, you can request them within a plugin by adding the following line to the Mainsub function:

```
pd.GetCheckBox("AnalysisPropertiesForm", "StandardizedCheck").Checked = True
```

Generally, any aspect of an analysis that can be specified interactively can be specified within a plugin by using the methods and properties of the pd class.

## Defining Program Variables that Correspond to Model Variables

There are five pd methods that create an object in a path diagram: Observed, Unobserved, Path, Cov and Caption. Each of these methods returns a reference to the object that it creates. For example, the Observed method creates an observed variable in the path diagram and also returns a reference to that observed variable. Instead of writing the lines

```
pd.Observed("wordmean")
pd.Unobserved("verbal")
```

to create an observed variable called wordmean and an unobserved variable called verbal, you can write the following lines (in Visual Basic):

```
Dim wordmean As PDElement = pd.Observed("wordmean")
Dim verbal As PDElement = pd.Unobserved("verbal")
```



Then you can use the program variable wordmean to refer to the model variable called wordmean, and use the program variable verbal to refer to the model variable called verbal. If you want to draw a single-headed arrow from the verbal variable to the wordmean variable, you can write either

pd.Path(wordmean, verbal)

or

pd.Path("wordmean", "verbal")

The advantage of the unquoted version over the quoted version is that, with the unquoted version, typing errors are likely to be detected when you click the Check Syntax button. With the quoted version, typing errors cannot be detected until you use the plugin, if they are detected at all.

The file Ex37b-plugin.vb contains a plugin that has the same functionality as Ex37a-plugin.vb. The difference is that Ex37b-plugin.vb uses Visual Basic variables to refer to model variables. Language-specific versions of Ex37b-plugin.vb are saved in the folders %amosplugins%\Japanese and %amosplugins%\English. You can use one of the pre-written language-specific plugins by copying it to the %amosplugins% folder.

# Simple User-Defined Estimands I 

## Introduction

This example shows how to estimate user-defined functions of model parameters along with bootstrap standard errors, confidence intervals, and significance tests. In this example, a single user-defined function is estimated: an indirect effect.

The example demonstrates a simplified approach to the estimation of user-defined functions of parameters. The simplified version is limited to estimands that can be defined by a single expression. A more general version (not demonstrated here) of Amos's user-defined estimand capability allows the estimands to be defined by a program of arbitrary length and complexity. The more general version is documented in the online help under the topic "CValue Class Reference" and is demonstrated in videos at http://amosdevelopment.com/features/user-defined/user-definedgeneral/index.html.

## The Wheaton Data Revisited

Example 6 described three alternative models for the Wheaton et al. (1977) data. Here, we re-examine Model B from Example 6. The following path diagram, which is in the file Ex38.amw, shows Model B from Example 6, with some parameter names added.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b103362679.jpg)

## Estimating an Indirect Effect

Five of the regression weights in this model have been named $A$, $B$, $C$, $D$, and $E$ to make it easy to discuss the indirect effect of ses on powles71. There are two such indirect effects: the product $AB$ and the product $CDB$. You can estimate the sum of the two indirect effects, $AB + CDB$, by clicking View > Analysis Properties > Output and putting a check mark next to Indirect, direct & total effects. This capability is built into Amos and does not require you to specify a user-defined estimand. Suppose, however, that you want to estimate both of the individual indirect effects, $AB$ and $CDB$, as well as their sum. All three can be estimated as user-defined estimands in the following way.

- Click Not estimating any user-defined estimand on the status bar in the lower-left corner of the Amos Graphics window. Then click Define new estimands from the pop-up menu.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-46dd262cdf.jpg)

- In the new window that opens, enter three lines to define three custom estimands:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-b6da8b5eac.jpg)

The names of the three custom estimands are Indirect_AB, Indirect_CDB and Sum. You can make up other names instead. Names for estimands must be made up of letters of the alphabet, numbers, and the underscore character. The first character must be alphabetic. Uppercase and lowercase are not distinguished, so that if you call an estimand Abc you cannot call another estimand abc.
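The naming rules above can be written as a short check. The following Python sketch is only an illustration and is not part of Amos: the helper names is_valid_estimand_name and find_case_collisions are hypothetical, and reading "letters of the alphabet" as the ASCII letters A to Z is an assumption.

```python
import re

# Assumption: "letters of the alphabet" means the ASCII letters A-Z and a-z.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_estimand_name(name):
    """True if name starts with a letter and uses only letters, digits, and underscores."""
    return NAME_PATTERN.fullmatch(name) is not None


def find_case_collisions(names):
    """Return pairs of names that differ only in letter case."""
    seen = {}
    collisions = []
    for name in names:
        key = name.lower()
        if key in seen:
            collisions.append((seen[key], name))
        else:
            seen[key] = name
    return collisions


if __name__ == "__main__":
    for candidate in ["Indirect_AB", "Indirect_CDB", "Sum", "2ndEffect", "_sum"]:
        print(candidate, is_valid_estimand_name(candidate))
    print(find_case_collisions(["Abc", "abc", "Sum"]))
```

The three names used in this example pass the check, while a name that starts with a digit or an underscore does not. Abc and abc are reported as a collision because uppercase and lowercase are not distinguished.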

The two-character sequence "p." is used as a prefix for parameter names. For example, "p.A" means "the parameter named A." The "p." prefixes can usually be omitted with some improvement in readability as shown here:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5e73cbf6bd.jpg)

One benefit of using the "p." prefix is that typing "p." displays a list of parameter names that you can choose from. In the following screenshot, double-clicking A in the parameter list has the same effect as typing "A" on the keyboard.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-c63c5d2509.jpg)

There is one situation where you must use the "p." prefix: if you have a parameter named $A$ and also a variable named $A$, then typing a plain "A" is ambiguous. In that case, you have to type "p.A" for the parameter called $A$, or "v.A" for the variable called $A$.

- Optionally, add lines and comments, as shown here:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f2cf9999b5.jpg)
- Click the Close button.
- Click Yes in the following dialog.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-5b9e3319f7.jpg)
- In the Save As dialog, type indirect effects in the File name box. Then click the Save button.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-4b0e9c3572.jpg)
- Click View > Analysis Properties > Bootstrap, and put check marks next to Perform bootstrap and Bias-corrected confidence intervals. Also, since the data file contains sample moments and not raw data, put a check mark next to Monte Carlo (parametric bootstrap).
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-99edfc03c1.jpg)
- Click Analyze > Calculate Estimates.
- Click View > Text Output.



- In the Amos Output window, double-click Estimates, then double-click Scalars, then click User-defined estimands.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f199e03b7d.jpg)

The estimand called Indirect_AB is estimated to be -0.205. This is the product of the regression weight $A$ (-0.212) and the regression weight $B$ (0.971).
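The arithmetic behind the three estimands can be sketched in a few lines of Python. This is an illustration only, not the syntax used in the Amos estimand window. The values of $A$ and $B$ are the rounded weights quoted above; the estimates of $C$ and $D$ are not quoted in the text, so the values used for them here are made-up placeholders, and the function name indirect_effects is hypothetical.

```python
def indirect_effects(a, b, c, d):
    """Return the two individual indirect effects and their sum."""
    indirect_ab = a * b
    indirect_cdb = c * d * b
    return {
        "Indirect_AB": indirect_ab,
        "Indirect_CDB": indirect_cdb,
        "Sum": indirect_ab + indirect_cdb,
    }


# Rounded regression weights quoted in the text.
A = -0.212
B = 0.971

# Made-up placeholders, NOT Amos estimates: C and D are not quoted in the text.
C_PLACEHOLDER = -0.5
D_PLACEHOLDER = 0.5

effects = indirect_effects(A, B, C_PLACEHOLDER, D_PLACEHOLDER)

# Only Indirect_AB is meaningful here, because only A and B are real values.
print(round(effects["Indirect_AB"], 3))
```

The product of the rounded weights is about -0.206, which agrees with the reported -0.205 to within the rounding of the displayed weights.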

- Click Bootstrap standard errors.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-8ac25c3ee1.jpg)

Indirect_AB is approximately normally distributed with a standard error of about 0.048.

- Click Bootstrap Confidence.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ef72d4f43b.jpg)

The population value of Indirect_AB is between -0.283 and -0.118 with 90% confidence. The estimate of -0.205 has a $p$ value of 0.013. It is significantly different from zero at the 0.05 level but not at the 0.01 level.
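To make the terms bootstrap standard error and bootstrap confidence interval concrete, the following Python sketch computes both from a list of bootstrap replicates. It rests on several assumptions: the replicates are simulated stand-ins drawn from a normal distribution with the reported estimate and standard error (they are not Amos output), the interval is a plain percentile interval rather than the bias-corrected interval that Amos reports, and the function names are hypothetical.

```python
import random
import statistics


def bootstrap_standard_error(replicates):
    """Bootstrap standard error: the standard deviation of the replicates."""
    return statistics.stdev(replicates)


def percentile_interval(replicates, level=0.90):
    """Plain percentile interval (NOT the bias-corrected interval Amos reports)."""
    ordered = sorted(replicates)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[int(round(tail * last))]
    upper = ordered[int(round((1.0 - tail) * last))]
    return lower, upper


# Simulated stand-ins for bootstrap replicates of Indirect_AB. They are drawn
# from a normal distribution with the reported estimate and standard error,
# so they are illustrative only and are not Amos output.
rng = random.Random(38)
replicates = [rng.gauss(-0.205, 0.048) for _ in range(2000)]

se = bootstrap_standard_error(replicates)
lower, upper = percentile_interval(replicates, level=0.90)

# An interval that excludes zero corresponds to rejecting a zero effect
# at the matching two-sided level (0.10 for a 90% interval).
excludes_zero = upper < 0 or lower > 0
print(round(se, 3), round(lower, 3), round(upper, 3), excludes_zero)
```

Because the stand-in replicates are exactly normal, the limits from this sketch need not match the bias-corrected limits in the Amos output.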

## Estimating the Indirect Effect without Naming Parameters

If you plan on estimating a function of some parameters, it helps to name those parameters, as was done above. However, you don't have to name the parameters. The following steps show how to estimate the same indirect effect that we just estimated but without making use of parameter names.

- Click Estimating Simple indirect effect on the status bar in the lower-left corner of the Amos Graphics window. Then click Edit Simple indirect effects on the menu that pops up.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f75fd92c15.jpg)

Wherever a parameter is referred to by name, substitute a description of the parameter as follows.

- Change "p.A" to "e.DirectEffect(alienation71,ses)"
- Change "p.B" to "e.DirectEffect(powles71,alienation71)"
- Change "p.C" to "e.DirectEffect(alienation67,ses)"
- Change "p.D" to "e.DirectEffect(alienation71,alienation67)"

After these substitutions, the specification for the custom estimands looks like this:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-6e36cdc31f.jpg)

- Close the window.
- Click Yes in the dialog that appears.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-1f091604bc.jpg)


- Click Analyze > Calculate Estimates.
- Click View > Text Output. (The text output is the same as before.)


## Example 39

## Simple User-Defined Estimands II

## Introduction

This example shows how to estimate the difference between two standardized regression weights, along with a bootstrap standard error, a confidence interval, and a significance test for the difference.

## About the Data

Four quizzes were administered to a class of 39 students. The quizzes were approximately equally spaced throughout the semester. The file QuizComplete.txt contains the scores of the 22 students who took all four quizzes.

## A Markov Model

The file Ex39.amw contains the following Markov model for scores on the four quizzes.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-f3059af844.jpg)

The following path diagram shows the standardized regression weights estimated for this model.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ccc1f6978c.jpg)

Let's compare two standardized regression weights, say the weight for using $q2$ to predict $q3$, and the weight for using $q3$ to predict $q4$. The difference between the two estimates is about $0.39 - 0.35 = 0.04$. Let's also get a standard error for that difference, along with a confidence interval and significance test for the difference.

- Click Not estimating any user-defined estimand on the status bar in the lower-left corner of the Amos Graphics window. Then click Define new estimands on the menu that pops up.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-ffe5ba457e.jpg)
- In the window that opens, enter one line to specify the new estimand, as follows:
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-3e9a24aac1.jpg)

You can choose a name other than StandardizedWeightDiff if you wish.

- Click the Check Syntax button on the toolbar. If you have made no typing mistakes, the message "Syntax is OK" will be displayed in the Description box.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a98c82a413.jpg)
- Close the window.
- Click Yes in the following dialog.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-e49c50483e.jpg)

- In the Save As dialog, type StandardizedDifference in the File name box. Then click the Save button.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-a78b74df42.jpg)

- Click View > Analysis Properties > Bootstrap, and put check marks next to Perform bootstrap and Bias-corrected confidence intervals.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-d02eb11f68.jpg)
- Click Analyze > Calculate Estimates.
- Click View > Text Output.
- In the Amos Output window, double-click Estimates, then double-click Scalars, then click User-defined estimands.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-620408f63d.jpg)

The estimand called StandardizedWeightDiff is estimated to be 0.047.

- Click Bootstrap standard errors.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-af6448b427.jpg)

The difference is approximately normally distributed with a standard error of about 0.426.

- Click Bootstrap Confidence.
![](https://ai-docs.amosdevelopment.com/Images/ug/ug-42452c9a89.jpg)

The population value of the difference is between -0.679 and 0.688 with 90% confidence. The estimate of 0.047 is not significantly different from zero at any conventional significance level ($p = 0.934$).
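Since the difference is described as approximately normally distributed, a rough cross-check is possible from the estimate and its standard error alone. The Python sketch below uses a normal approximation, which is a different technique from the bias-corrected bootstrap that Amos applies, so its results are expected to be close to the Amos output but not identical. The function names and the critical value 1.645 (for roughly 90% coverage) are choices made for this illustration.

```python
import math


def normal_two_sided_p(estimate, standard_error):
    """Two-sided p value for a zero population value under a normal approximation."""
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def normal_interval(estimate, standard_error, z_crit=1.645):
    """Normal-theory interval; z_crit=1.645 gives roughly 90% coverage."""
    half_width = z_crit * standard_error
    return estimate - half_width, estimate + half_width


estimate = 0.047        # reported estimate of StandardizedWeightDiff
standard_error = 0.426  # reported bootstrap standard error

p_value = normal_two_sided_p(estimate, standard_error)
lower, upper = normal_interval(estimate, standard_error)
print(round(p_value, 3), round(lower, 3), round(upper, 3))
```

The approximation gives a $p$ value near 0.91 and an interval of roughly -0.65 to 0.75. Both lead to the same conclusion as the bootstrap results: the interval contains zero, and the difference is not significant.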

## Notation

- $q$ = the number of parameters
- $\gamma$ = the vector of parameters (of order $q$)
- $G$ = the number of groups
- $N^{(g)}$ = the number of observations in group $g$
- $N = \sum_{g=1}^{G} N^{(g)}$ = the total number of observations in all groups combined
- $p^{(g)}$ = the number of observed variables in group $g$
- $p^{*(g)}$ = the number of sample moments in group $g$. When means and intercepts are explicit model parameters, the relevant sample moments are means, variances, and covariances, so that $p^{*(g)} = p^{(g)}\left(p^{(g)}+3\right) / 2$. Otherwise, only sample variances and covariances are counted, so that $p^{*(g)} = p^{(g)}\left(p^{(g)}+1\right) / 2$.
- $p = \sum_{g=1}^{G} p^{*(g)}$ = the number of sample moments in all groups combined
- $d = p - q$ = the number of degrees of freedom for testing the model
- $x_{ir}^{(g)}$ = the $r$-th observation on the $i$-th variable in group $g$
- $\mathbf{x}_{r}^{(g)}$ = the $r$-th observation in group $g$
- $\mathbf{S}^{(g)}$ = the sample covariance matrix for group $g$
- $\Sigma^{(g)}(\gamma)$ = the covariance matrix for group $g$, according to the model
- $\mu^{(g)}(\gamma)$ = the mean vector for group $g$, according to the model
- $\Sigma_{0}^{(g)}$ = the population covariance matrix for group $g$
- $\mu_{0}^{(g)}$ = the population mean vector for group $g$
- $\mathbf{s}^{(g)} = \operatorname{vec}\left(\mathbf{S}^{(g)}\right)$ = the $p^{*(g)}$ distinct elements of $\mathbf{S}^{(g)}$ arranged in a single column vector
- $\sigma^{(g)}(\gamma) = \operatorname{vec}\left(\Sigma^{(g)}(\gamma)\right)$
- $r$ = the non-negative integer specified by the ChiCorrect method. By default $r = G$. When the Emulisrel6 method is used, $r = G$ and cannot be changed by using ChiCorrect.
- $n = N - r$
- $\mathbf{a}$ = the vector of order $p$ containing the sample moments for all groups; that is, $\mathbf{a}$ contains the elements of $\mathbf{S}^{(1)}, \ldots, \mathbf{S}^{(G)}$ and also (if means and intercepts are explicit model parameters) $\overline{\mathbf{x}}^{(1)}, \ldots, \overline{\mathbf{x}}^{(G)}$.
- $\mathrm{a}_{0}$ = the vector of order $p$ containing the population moments for all groups; that is, $\mathrm{a}_{0}$ contains the elements of $\Sigma_{0}^{(1)}, \ldots, \Sigma_{0}^{(G)}$ and also (if means and intercepts are explicit model parameters) $\mu_{0}^{(1)}, \ldots, \mu_{0}^{(G)}$. The ordering of the elements of $\mathrm{a}_{0}$ must match the ordering of the elements of $\mathbf{a}$.
- $\mathrm{a}(\gamma)$ = the vector of order $p$ containing the population moments for all groups according to the model; that is, $\mathrm{a}(\gamma)$ contains the elements of $\Sigma^{(1)}(\gamma), \ldots, \Sigma^{(G)}(\gamma)$ and also (if means and intercepts are explicit model parameters) $\mu^{(1)}(\gamma), \ldots, \mu^{(G)}(\gamma)$. The ordering of the elements of $\mathrm{a}(\gamma)$ must match the ordering of the elements of $\mathbf{a}$.
- $F(\mathrm{a}(\gamma), \mathbf{a})$ = the function (of $\gamma$) that is minimized in fitting the model to the sample
- $\hat{\gamma}$ = the value of $\gamma$ that minimizes $F(\mathrm{a}(\gamma), \mathbf{a})$
- $\hat{\Sigma}^{(g)} = \Sigma^{(g)}(\hat{\gamma})$
- $\hat{\mu}^{(g)} = \mu^{(g)}(\hat{\gamma})$
- $\hat{\mathrm{a}} = \mathrm{a}(\hat{\gamma})$
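The counting definitions for $p^{*(g)}$, $p$, $d$, and $n$ can be checked with a short Python sketch. The function names are hypothetical, and the group sizes passed to effective_n are made-up values used only to show the arithmetic. The single-group check uses figures from the CMIN table shown earlier in this guide, where a model with six observed variables and estimated means and intercepts has 19 parameters and 8 degrees of freedom, and the saturated model has 27 parameters.

```python
def sample_moments(num_observed, means_in_model):
    """p*(g): the number of sample moments in one group."""
    extra = 3 if means_in_model else 1
    # num_observed * (num_observed + extra) is always even, so // is exact.
    return num_observed * (num_observed + extra) // 2


def degrees_of_freedom(observed_per_group, num_parameters, means_in_model):
    """d = p - q, where p sums the sample moments over all groups."""
    p = sum(sample_moments(k, means_in_model) for k in observed_per_group)
    return p - num_parameters


def effective_n(group_sizes, r=None):
    """n = N - r, where r defaults to the number of groups G."""
    if r is None:
        r = len(group_sizes)
    return sum(group_sizes) - r


# One group, six observed variables, means and intercepts in the model.
print(sample_moments(6, means_in_model=True))            # sample moments
print(degrees_of_freedom([6], 19, means_in_model=True))  # degrees of freedom

# Made-up group sizes, for illustration only.
print(effective_n([100, 120]))
```

With six observed variables and means in the model, the sketch counts 27 sample moments, which matches the number of parameters of the saturated model, and $27 - 19 = 8$ degrees of freedom for the 19-parameter model.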