Inferential Statistics in SPSS: The General Linear Model

The question of whether there is a relationship between variables can often be answered in more than one way. For example, if each of two variables provides approximately normally distributed data with five or more levels, then, based on Fig. 6.1 and Table 6.2, the statistic to use is either the Pearson correlation or bivariate (simple) regression, and that would be our recommendation. However, some researchers instead choose to divide the independent variable into two or more categories or groups, such as low, medium, and high, and then do a one-way ANOVA.

Conversely, in a second example, others who start with an independent variable that has only a few (say, two to four) ordered categories may choose to do a correlation instead of a one-way ANOVA. Although these choices are not necessarily wrong, we do not think they are usually the best practice. In the first example, information is lost by dividing a continuous independent variable into a few categories. In the second example, there would be a restricted range, which tends to decrease the size of the correlation coefficient.
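The information loss from the first example can be demonstrated directly. The following sketch (not from the book; it uses synthetic data and SciPy rather than SPSS) correlates an outcome with a continuous predictor, then with the same predictor collapsed into low/medium/high groups:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic data: a continuous predictor linearly related to the outcome.
n = 1000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# Option 1: keep x continuous and compute the Pearson correlation.
r_continuous, _ = stats.pearsonr(x, y)

# Option 2: collapse x into low/medium/high tertile groups and correlate
# y with the group codes -- the within-group variation in x is discarded.
groups = np.digitize(x, np.quantile(x, [1 / 3, 2 / 3]))
r_binned, _ = stats.pearsonr(groups, y)

print(r_continuous, r_binned)
```

With the variation inside each category thrown away, the correlation computed from the three-group codes is smaller in absolute value than the one computed from the original continuous scores.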

In the previous examples, we recommended one of the choices, but the fact that there are two choices raises a bigger and more complex issue. Statisticians point out, and can prove mathematically, that the distinction between difference and associational statistics is an artificial one. Figure 6.2 shows that, although we make a distinction between difference and associational inferential statistics, they both serve the purpose of exploring and describing (top box) relationships and both are subsumed by the general linear model (middle box).

Statisticians state that all common parametric statistics are relational. Thus, the full range of methods used to analyze one continuous dependent variable and one or more independent variables, either continuous or categorical, are mathematically similar. The model on which this is based is called the general linear model. The relationship between the independent and dependent variables can be expressed by an equation with weights for each of the independent/predictor variables plus an error term.
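The equation described above has the form Y = b0 + b1*X1 + b2*X2 + ... + error. As an illustration (a minimal sketch with made-up data, not an SPSS procedure), the weights can be recovered by ordinary least squares in NumPy, with one continuous and one dichotomous predictor:

```python
import numpy as np

rng = np.random.default_rng(1)

# One continuous outcome, one continuous predictor, one dichotomous
# (dummy-coded) predictor: y = b0 + b1*x1 + b2*x2 + error.
n = 500
x1 = rng.normal(size=n)
x2 = rng.integers(0, 2, size=n).astype(float)  # 0/1 group membership
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Design matrix with an intercept column; least squares gives the weights.
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ b  # the error term of the general linear model
print(b)
```

The same fitting machinery handles categorical and continuous predictors alike, which is exactly why the general linear model subsumes both difference and associational statistics.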

The bottom part of Fig. 6.2 indicates that a t test or one-way ANOVA with a dichotomous independent variable is analogous to bivariate regression. Finally, as shown in the lowest row of boxes in Fig. 6.2, a one-way or factorial ANOVA can be computed mathematically as a multiple regression with multiple dichotomous predictors (dummy variables). Note in Fig. 6.1 and Tables 6.1 and 6.3 that SPSS uses the GLM program to perform a variety of statistics, including factorial ANOVA and MANOVA.
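Both equivalences in the preceding paragraph can be verified numerically. In this sketch (synthetic data, SciPy/NumPy rather than SPSS's GLM program), the pooled-variance t test matches the t for the slope of a regression on a 0/1 dummy, and the one-way ANOVA F matches the F from a regression on two dummy variables:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# --- t test vs. bivariate regression with a dichotomous predictor ---
g1 = rng.normal(loc=0.0, size=50)
g2 = rng.normal(loc=0.8, size=50)
t_ttest, _ = stats.ttest_ind(g2, g1)  # pooled-variance t test

y = np.concatenate([g1, g2])
dummy = np.repeat([0.0, 1.0], 50)
reg = stats.linregress(dummy, y)
t_regression = reg.slope / reg.stderr  # identical t statistic

# --- one-way ANOVA vs. multiple regression with dummy variables ---
g3 = rng.normal(loc=1.6, size=50)
f_anova, _ = stats.f_oneway(g1, g2, g3)

y3 = np.concatenate([g1, g2, g3])
n = len(y3)
d1 = np.repeat([0.0, 1.0, 0.0], 50)  # dummy for group 2
d2 = np.repeat([0.0, 0.0, 1.0], 50)  # dummy for group 3 (group 1 is reference)
X = np.column_stack([np.ones(n), d1, d2])
b, *_ = np.linalg.lstsq(X, y3, rcond=None)
r2 = 1 - np.sum((y3 - X @ b) ** 2) / np.sum((y3 - y3.mean()) ** 2)
f_regression = (r2 / 2) / ((1 - r2) / (n - 3))  # F(k-1, n-k) from R-squared

print(t_ttest, t_regression, f_anova, f_regression)
```

The pairs of statistics agree to machine precision, which is the mathematical sense in which the difference statistics on the left of Fig. 6.2 are "analogous to" the associational statistics on the right.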

Fig. 6.2. A general linear model diagram of the selection of inferential statistics.

Although we recognize that our distinction between difference and associational parametric statistics is a simplification, we think it is useful because it corresponds to a distinction made by most researchers. We hope that this glimpse of an advanced topic is clear and helpful.

Source: Morgan, George A., Leech, Nancy L., Gloeckner, Gene W., & Barrett, Karen C. (2012), IBM SPSS for Introductory Statistics: Use and Interpretation, 5th edition, Routledge.
