Comparison Tests in the Research

Researchers in management often find they want to test hypotheses. For example, a researcher may want to test the hypothesis that the introduction of formalized strategic planning improves the financial performance of banks (Robinson and Pearce, 1983). Comparison tests are powerful tools for testing such a research hypothesis.

Statistical tests generally fall into one of two categories: parametric tests and non-parametric tests. Aside from a number of differences, which we will look at later in this chapter, parametric and non-parametric tests essentially share the same basic logic. This common logic is presented in the first section of this chapter. The second and third sections then describe how these two types of statistical tests can be applied as part of the research process. Both of these sections are organized according to questions that commonly arise during the research process, along with appropriate statistical tests. So that readers can use this chapter to go directly to the test corresponding to their research question, I have systematically repeated information about the conditions under which each test should be applied, together with the rules to follow when deciding upon a test. The conclusion to the chapter presents an effective strategy with which to approach statistical testing.

Source: Thietart Raymond-Alain et al. (2001), Doing Management Research: A Comprehensive Guide, SAGE Publications Ltd; 1 edition.

Statistical Tests in the Research

In this section we explore the general context in which researchers carry out statistical tests. We define the essential concepts involved and outline the general steps followed when using this type of test.

1. Inference and Statistics

Inference has a very important place in management research. Conclusions or generalizations often have to be established on the basis of observations or results, and in some cases statistics can add to their precision. As inference is often at the heart of the reasoning by which the statistician generalizes from a sample to a population, the branch of statistics devoted to this type of approach is called inferential statistics. The goal of inferential statistics is to evaluate hypotheses through information collected from a selected sample. Statistical tests are thus at the very core of inferential statistics.

2. Research Hypotheses

Research hypotheses are unproven statements about a research field. They may be based on existing theoretical material, previous empirical results, or even personal impressions or simple conjecture. For instance, one of Robinson and Pearce’s (1983: 201) research hypotheses was ‘Banks engaging in formal planning will have a significantly higher mean performance ranking than nonformal planning banks from 1977 and 1979’. If researchers want to use statistical tests to evaluate a research hypothesis, they must first translate the hypothesis into a statistical hypothesis.

3. Statistical Hypotheses

A statistical hypothesis is a quantified statement about the characteristics of a population. More precisely, it describes the distribution of one or more random variables. It might describe the parameters of a given distribution, or a probability distribution of an observed population.

A statistical hypothesis is generally presented in two parts: the null hypothesis and the alternative or contrary hypothesis. These two hypotheses are incompatible. They describe two complementary states. The null hypothesis describes a situation in which there is no major shift from the status quo, or there is an absence of difference between parameters. The alternative hypothesis – that there is a major shift from the status quo, or that there is a difference between parameters – is generally the hypothesis the researcher wishes to establish. In this case the alternative hypothesis corresponds to the research hypothesis – the researcher believes it to be true (Sincich, 1996). The researcher’s goal is then to disprove the null hypothesis in favor of the alternative hypothesis (Sincich, 1996; Zikmund, 1994).

The null hypothesis is generally marked H0 and the alternative (or contrary) hypothesis H1 or Ha. It is important to bear in mind that statistical tests are designed to refute and not to confirm hypotheses. In other words, these tests do not aim to prove hypotheses, and do not have the capacity to do so. They can only show that the level of probability is too low for a given statement to be accepted (Kanji, 1993). For this reason, statistical hypotheses are normally formulated so that the alternative hypothesis H1 corresponds to the research hypothesis one is trying to establish. In this way, rather than attempting to prove that this hypothesis is correct, the goal becomes to reject the null hypothesis.

4. Statistical Tests

Statistical tests are used to assess the validity of statistical hypotheses. They are carried out on data collected from a representative sample of the studied population. A statistical test should lead to rejecting or not rejecting an initial hypothesis: in most cases the null hypothesis.

Statistical tests generally fall into one of two categories: parametric tests and non-parametric tests.

Statistical tests were first used in the experimental sciences and in management research. For instance, the Student test was designed by William Sealy Gosset (who was known as ‘Student’), when working with Guinness breweries. But the mathematical theory of statistical tests was developed by Jerzy Neyman and Egon Sharpe Pearson. These two authors also stressed the importance of considering not only the null hypothesis, but also the alternative hypothesis (Lehmann, 1991).

In a statistical test focusing on one parameter of a population, for instance its mean or variance, if the null hypothesis H0 states one value for the parameter, the alternative hypothesis H1 would state that the parameter is different from this specific value.

In a statistical test focusing on the probability distribution of a population, if the null hypothesis H0 states that the population follows a specific distribution, for example a normal distribution, the alternative hypothesis H1 would state that the population does not follow this specific distribution.

4.1. A single population

The number of populations being considered will influence the form of the statistical test. If a single population is observed, the test may compare a parameter θ of the population to a given value θ0.

For example, Robinson and Pearce (1983: 201) hypothesized that companies with a formal planning policy will perform better than companies without such a policy. In this case the statistical test used would be a one-tail test to the right (a right-tail test). If the hypothesis had suggested that inferior performance resulted from formal planning, a one-tail test to the left (a left-tail test) would be used. If, however, the two authors had simply hypothesized that formal planning would lead to a difference in performance, without specifying in which direction, a two-tail test would have been appropriate. These three alternative hypotheses would be formulated as follows:

H1: θ > θ0 (right-tail test)

H1: θ < θ0 (left-tail test)

H1: θ ≠ θ0 (two-tail test)

where θ corresponds to the performance of companies with a formal planning policy, and θ0 to the performance of companies without such a policy.

Sometimes, the null hypothesis can also be expressed as an inequality. This gives the following systems of hypotheses:

H0: θ ≤ θ0 against H1: θ > θ0

H0: θ ≥ θ0 against H1: θ < θ0

In these cases, the symbols ‘≤’ (less than or equal to) and ‘≥’ (greater than or equal to) are used in formulating the null hypothesis H0 in order to cover all cases in which the alternative hypothesis H1 is not valid.

However, the general convention is to express H0 as an equality. The reasoning behind this convention is the following: if the alternative hypothesis is stated as an inequality, for instance H1: θ > θ0, then every test leading to a rejection of the null hypothesis H0: θ = θ0, and therefore to the acceptance of the alternative hypothesis H1: θ > θ0, would also lead to the rejection of every hypothesis H0: θ = θi, for every θi less than θ0. In other words, H0: θ = θ0 represents the most unfavorable situation possible (from the researcher’s point of view) if the alternative hypothesis H1: θ > θ0 turned out to be incorrect. For this reason, expressing the null hypothesis as an equality covers all possible situations.

4.2. Two populations

When a statistical test focuses on the parameters of two populations, the goal is to find out whether the two populations described by a specific parameter are different. If θ1 and θ2 represent the parameters of the two populations, the null hypothesis predicts the equality of these two parameters:

H0: θ1 = θ2

The alternative hypothesis may be expressed in three different ways:

H1: θ1 ≠ θ2 (two-tail test)

H1: θ1 < θ2 (left-tail test)

H1: θ1 > θ2 (right-tail test)

4.3. More than two populations

A statistical test on k populations aims to determine if these populations differ on a specific parameter. θ1, θ2, … , θk are the k parameters describing the k populations being compared. The null hypothesis would be that the k parameters are identical:

H0: θ1 = θ2 = … = θk

The alternative hypothesis would then be formulated as follows:

H1: the values of θi (i = 1, 2, … , k) are not all identical.

This means that the null hypothesis will be rejected in favor of the alternative hypothesis if the value for any single parameter is found to be different from that of another.

5. Risk of Error

Statistical tests are used to give an indication as to what decision to make – for instance, whether to reject or not to reject the null hypothesis H0. As this decision is, in most research situations, based on partial information only, derived from observations made of a sample of the population, a margin of error is involved (Sincich, 1996; Zikmund, 1994). A distinction is made between two types of error in statistical tests: error of the first type, or Type I error (α), and error of the second type, or Type II error (β).

5.1. Type I and Type II error

By observing the sample group, a researcher may be led mistakenly to reject the null hypothesis – when in fact the population actually fulfills the conditions of this hypothesis. Type I error (α) measures the probability of rejecting the null hypothesis when it is in fact true. Conversely, a researcher’s observations of a sample may not allow the null hypothesis to be rejected, when in fact the population actually satisfies the conditions of the alternative hypothesis. Type II error (β) measures the probability of not rejecting the null hypothesis when it is in fact false.

As the null hypothesis may be true or false and the researcher may reject it or not reject it, there are only four possible and mutually exclusive outcomes of a statistical test. These are presented in Table 11.1.

Only two of the four cases in Table 11.1 involve an error. Type I error can only appear if the null hypothesis is rejected. In the same way, Type II error is only possible if the null hypothesis is not rejected. The two types of error can never be present at the same time.

It is tempting to choose a minimal value for Type I error α. Unfortunately, decreasing this value increases Type II error, and vice versa. The only way to minimize both α and β is to use a larger sample (Sincich, 1996). Otherwise, a compromise must be sought between α and β, for example by measuring the power of the test.
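The trade-off between the two error risks can be illustrated numerically. The following is a minimal sketch, not taken from the source: it assumes a right-tail test on a mean with a known standard deviation, and the values mu0, mu_true and sigma are hypothetical numbers chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, mu0=500.0, mu_true=505.0, sigma=20.0):
    """Type II error (beta) of a right-tail z test of H0: mu = mu0.

    mu0, mu_true and sigma are hypothetical illustration values.
    """
    se = sigma / sqrt(n)  # standard deviation of the sample mean
    # Smallest sample mean that leads to rejecting H0 at level alpha
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = probability that the sample mean stays below that limit
    # when the true population mean is mu_true
    return NormalDist(mu_true, se).cdf(critical_mean)


for alpha in (0.10, 0.05, 0.01):
    print(alpha,
          round(type_ii_error(alpha, n=30), 3),
          round(type_ii_error(alpha, n=120), 3))
```

Under these assumptions, each row shows beta growing as alpha shrinks, while the larger sample gives a smaller beta at every level of alpha.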

When using statistical tests, it is preferable not to speak of accepting the null hypothesis, but rather of not rejecting it. This semantic nuance is important: if the aim of the test was to accept H0, the validity of its conclusion would be measured by Type II error β – the probability of not rejecting the null hypothesis when it is false. However, the value of β is not constant. It varies depending on the specific values of the parameter, and is very difficult to calculate in most statistical tests (Sincich, 1996). Because of the difficulty of calculating β, making a decision based upon the power of the test or the effectiveness curve can be tricky.

There is actually another, more practical solution: to choose a null hypothesis in which a possible Type I error α would be much more serious than a Type II error β. For example, to prove a suspect guilty or innocent, it may be preferable to choose as the null hypothesis ‘the suspect is innocent’ and as the alternative hypothesis ‘the suspect is guilty’. Most people would probably agree that in this case, a Type I error (convicting someone who is innocent) is more serious than a Type II error (releasing someone who is guilty). In such a context, the researcher may be content to minimize the Type I error α.

5.2. Significance level

Before carrying out a test, a researcher can determine what level of Type I error will be acceptable. This is called the significance level of a statistical test.

A significance level is a probability threshold. A significance level of 5 per cent or 1 per cent is common in management research. In determining a significance level, a researcher is saying that if Type I error – the probability of wrongly rejecting a null hypothesis – is found to be greater than this level, it will be considered significant enough to prevent the null hypothesis H0 from being rejected.

In management research, significance level is commonly marked with asterisks. An example is the notation system employed by Horwitch and Thietart (1987): p < 0.10*; p < 0.05**; p < 0.01***; p < 0.001****, where one asterisk corresponds to results with a 10 per cent significance level, two asterisks 5 per cent, three asterisks 1 per cent and four asterisks 0.1 per cent (that is, one in one thousand). If no asterisk is present, it means the results are not significant.
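As an illustration, the asterisk notation described above can be written as a small lookup. This is only a sketch of that convention; the function name significance_stars is an assumption introduced here.

```python
def significance_stars(p):
    """Return the asterisk marking for a p value, using the thresholds
    p < 0.001 (****), p < 0.01 (***), p < 0.05 (**) and p < 0.10 (*)."""
    thresholds = ((0.001, "****"), (0.01, "***"), (0.05, "**"), (0.10, "*"))
    for threshold, stars in thresholds:
        if p < threshold:
            return stars
    return ""  # no asterisk: the result is not significant


for p in (0.0004, 0.03, 0.08, 0.25):
    print(p, significance_stars(p))
```

The thresholds are checked from the strictest to the most lenient, so each p value receives the largest number of asterisks it qualifies for.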

6. To Reject or not to Reject?

6.1. Statistic X

The decision to reject or not to reject the null hypothesis H0 is based on the value of a relevant statistic – which we refer to as a statistic X. A statistic X is a random variable, appropriate to the null hypothesis H0. It is calculated from data collected from one or more representative samples from one or more populations (Kanji, 1993). A statistic X may be quite simple, such as the mean or the variance, or it may be a complex function of these and other parameters. We will look at different examples later in this chapter.

A good statistic should present three characteristics (Kanji, 1993):

  1. It must behave differently according to whether H0 is true (and H1 false) or vice versa.
  2. Its probability distribution when H0 is true must be known and calculable.
  3. Tables defining this probability distribution must be available.

6.2. Rejection region, acceptance region, and critical value

The set of values of the statistic X that lead to rejecting the null hypothesis is called the rejection region (or the critical region). The complementary region is called the acceptance region (or, more accurately, the non-rejection region). The value representing the limit of the rejection region is called the critical value. In a one-tail test, there is only one critical value Xc, while in a two-tail test, there are two: Xc1 and Xc2. The acceptance region and the rejection region are both dependent on Type I error α, since α is the probability of rejecting H0 when it is actually true, and 1 – α is the probability of not rejecting H0 when it is true. This relationship is illustrated by Figure 11.1.

Rules for rejecting or not rejecting the null hypothesis

  • For a left-tail test, the null hypothesis is rejected for any value of the statistic X lower than the critical value Xc. The rejection region therefore comprises the values of X that are ‘too low’.
  • For a two-tail test, the null hypothesis H0 is rejected for values of the statistic X that are either less than the critical value Xc1 or greater than the critical value Xc2. The rejection region here comprises the values of X that are either ‘too low’ or ‘too high’.
  • Finally, for a right-tail test, the null hypothesis is rejected for any value of the statistic X greater than the critical value Xc. The rejection region in this case comprises the values of X that are ‘too high’.

6.3. P-value

Most statistical analysis software programs provide one very useful piece of information, the p value (also called the observed significance level). The p value is the probability, if the null hypothesis were true, of obtaining a value of X at least as extreme as the one found for a sample. The null hypothesis H0 will be rejected if the p value is lower than the pre-determined significance level α (Sincich, 1996).

It is becoming increasingly common to state the p value of a statistical test in published research articles (for instance Horwitch and Thietart, 1987). Readers can then compare the p value with different significance levels and see for themselves if the null hypothesis should be rejected or not. The p value also locates the statistic X in relation to the critical region (Kanji, 1993). For instance, if the data only narrowly indicate that the null hypothesis H0 should not be rejected, the p value may be just above the chosen significance level; while if the data provide solid reasons to reject the null hypothesis, the p value would be noticeably lower than the significance level.
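The p value rule can be sketched in a few lines for the simplest case. The example assumes that the statistic X follows a standard normal distribution when H0 is true; the observed value –1.79 and the function name p_value are illustrative assumptions.

```python
from statistics import NormalDist


def p_value(x, tail):
    """p value of an observed statistic x, assuming x follows N(0, 1) under H0."""
    cdf = NormalDist().cdf(x)
    if tail == "left":
        return cdf
    if tail == "right":
        return 1 - cdf
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("tail must be 'left', 'right' or 'two'")


alpha = 0.05  # significance level fixed before the test
for tail in ("two", "left", "right"):
    p = p_value(-1.79, tail)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(tail, round(p, 4), decision)
```

The same observed value can lead to different decisions depending on the form of the alternative hypothesis, which is why the tail of the test has to be fixed before looking at the data.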

7. Defining a Statistical Test – Step-by-Step

Statistical tests on samples generally follow the following method.

In practice, the task is much easier than this. Most statistical analysis software packages (SAS, SPSS, etc.) determine the statistic X appropriate to the chosen test, calculate its value, and indicate the p value of the test. Some programs, such as Statgraphics, will even go one step further and suggest whether the null hypothesis should be rejected or not according to the significance level fixed by the researcher.

The major difficulty researchers face, though, is to choose the right test. The two following sections provide guidelines for making this choice, first looking at parametric and then at non-parametric tests. The tests are presented in terms of research aims and the research question, and the conditions attached to the application of each test are enumerated.

Parametric Tests in the Research

1. Tests on Means

1.1. Comparing the mean m of a sample to a reference value μ0 when the variance σ² of the population is known

Research question Does the mean m, calculated from a sample taken from a population with a known variance of σ², differ significantly from a hypothetical mean μ0?

Application conditions

  • The variance σ² of the population is known (this is a very rare case!) but the mean μ is not known (the hypothesis suggests it equals μ0).
  • A random sample is used, containing n independent observations.
  • The size n of the sample should be greater than 5, unless the population follows a normal distribution, in which case the size has no importance. Actually, this size condition is to guarantee the mean of the sample follows a normal distribution (Sincich, 1996).

Hypotheses

The null hypothesis to be tested is H0: μ = μ0,

the alternative hypothesis is H1: μ ≠ μ0 (for a two-tail test)

or H1: μ < μ0 (for a left-tail test)

or H1: μ > μ0 (for a right-tail test)

Statistic The statistic calculated is

\[ Z = \frac{m - \mu_0}{\sigma / \sqrt{n}} \]

It follows a standard normal distribution (mean = 0 and standard deviation = 1). This is called a z test or a z statistic.

Interpreting the test

  • In a two-tail test, H0 will be rejected if Z < –Zα/2 or Z > Zα/2
  • in a left-tail test, H0 will be rejected if Z < –Zα
  • in a right-tail test, H0 will be rejected if Z > Zα
    where α is the defined significance level (or Type I error), and Zα and Zα/2 are the normal distribution values that can be found in the appropriate tables.
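The z test and its decision rules can be sketched as follows. This is an illustrative implementation under the conditions listed above, not the output of any particular statistics package; the function name z_test and the sample figures (m = 103, μ0 = 100, σ = 15, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(m, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (Z, reject) for a test of H0: mu = mu0 with known sigma."""
    z_stat = (m - mu0) / (sigma / sqrt(n))
    std = NormalDist()  # standard normal distribution, replaces the tables
    if tail == "two":
        reject = abs(z_stat) > std.inv_cdf(1 - alpha / 2)
    elif tail == "left":
        reject = z_stat < -std.inv_cdf(1 - alpha)
    elif tail == "right":
        reject = z_stat > std.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z_stat, reject


# Hypothetical sample: m = 103, mu0 = 100, sigma = 15, n = 100
for tail in ("two", "left", "right"):
    print(tail, z_test(103, 100, 15, 100, alpha=0.05, tail=tail))
```

Here inv_cdf plays the role of the printed normal tables: inv_cdf(1 – α) returns Zα and inv_cdf(1 – α/2) returns Zα/2.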

1.2. Comparing the mean m of a sample to a reference value μ0 when the variance σ² of the population is unknown

Research question Does the mean m, calculated from a sample taken from a population with an unknown variance σ², differ significantly from a hypothetical mean μ0?

Application conditions

  • The variance σ² of the population is not known and has to be estimated from the sample. The mean μ is also unknown (the hypothesis suggests it equals μ0).
  • A random sample is used, containing n independent observations.
  • The size n of the sample is greater than 30, unless the population follows a normal distribution, in which case the size has no importance.

Hypotheses

Null hypothesis, H0: μ = μ0,

the alternative hypothesis is H1: μ ≠ μ0 (for a two-tail test)

or H1: μ < μ0 (for a left-tail test)

or H1: μ > μ0 (for a right-tail test)

Statistic The unknown variance (σ²) of the population is estimated from the sample, with n – 1 degrees of freedom, using the formula

\[ s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - m)^2 \]

where xi is the value of observation i in the sample. The statistic calculated is

\[ T = \frac{m - \mu_0}{s / \sqrt{n}} \]

It follows Student’s distribution with n – 1 degrees of freedom, and is called a t test or a t statistic.

Interpreting the test When n is large, that is, greater than 30, this statistic approximates a normal distribution. In other words:

\[ T = \frac{m - \mu_0}{s / \sqrt{n}} \approx Z, \quad Z \sim N(0, 1) \]

The decision to reject or not to reject the null hypothesis can therefore be made by comparing the calculated statistic T to the normal distribution values. We recall that the decision-making rules for a standard normal distribution are:

  • in a two-tail test, H0 will be rejected if Z < –Zα/2 or Z > Zα/2
  • in a left-tail test, H0 will be rejected if Z < –Zα
  • in a right-tail test, H0 will be rejected if Z > Zα.

where α is the significance level (or Type I error) and Zα and Zα/2 are normal distribution values, which can be found in the appropriate tables.

But for smaller values of n, that is, lower than 30, the Student distribution (at n – 1 degrees of freedom) cannot be replaced by a normal distribution. The decision rules then become the following:

  • In a two-tail test, H0 will be rejected if T < –Tα/2; n – 1 or T > Tα/2; n – 1
  • in a left-tail test, H0 will be rejected if T < –Tα; n – 1
  • in a right-tail test, H0 will be rejected if T > Tα; n – 1.

Example: Comparing a mean to a given value (variance is unknown)

This time, the sample is much larger, containing 144 observations. The mean found in this sample is again m = 493. The estimated standard deviation is s = 46.891. Is it still possible to accept a mean in the population μ0 = 500, if we adopt a significance level α of 5 per cent?

The relatively large size of the sample (n = 144, and so is greater than 30) justifies the approximation of the statistic T to a normal distribution. Thus

\[ T = \frac{493 - 500}{46.891 / \sqrt{144}} \]

which gives –1.79. The tables provide values of Z0.025 = 1.96 and Z0.05 = 1.64.

Two-tail test: as –Zα/2 < T < Zα/2 (–1.96 < –1.79 < 1.96), the null hypothesis, stating that the mean of the population equals 500 (μ = 500), cannot be rejected.

Left-tail test: as T < –Zα (–1.79 < –1.64), the null hypothesis is rejected, to the benefit of the alternative hypothesis, according to which the mean in the population is less than 500 (μ < 500).

Right-tail test: as T < Zα (–1.79 < 1.64), the null hypothesis cannot be rejected.
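The arithmetic of this example can be checked with a short script. It is a sketch that uses the normal approximation adopted in the example, so p values computed this way would be slightly smaller than those obtained from Student’s distribution with 143 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the example: m = 493, mu0 = 500, s = 46.891, n = 144
m, mu0, s, n, alpha = 493, 500, 46.891, 144, 0.05

t_stat = (m - mu0) / (s / sqrt(n))

std = NormalDist()
z_half = std.inv_cdf(1 - alpha / 2)  # about 1.96
z_full = std.inv_cdf(1 - alpha)      # about 1.64

# True means that H0 is rejected for that form of the test
decisions = {
    "two-tail": abs(t_stat) > z_half,
    "left-tail": t_stat < -z_full,
    "right-tail": t_stat > z_full,
}

print(round(t_stat, 2))  # -1.79
print(decisions)
```

Only the left-tail test leads to rejecting H0, in line with the three conclusions above.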

As mentioned above, the major difficulty researchers face when carrying out comparison tests is to choose the appropriate test for their research hypothesis. To illustrate this, in the following example we present the results obtained by using proprietary computer software to analyze the above situation. For this example we will use the statistical analysis program, Statgraphics, although other software programs provide similar information.

The software carries out all the calculations, and even indicates the decision to make (to reject the null hypothesis or not) on the basis of the statistic T and the significance level α set by the researcher. In addition, it provides the p value, or the observed significance level. We have already mentioned the importance of the p value, which can provide more detailed information, and so refine the decision. In the first test (two-tail test), the null hypothesis is not rejected for a significance level of 5 per cent, but it would have been rejected if the Type I error risk had been 10 per cent. The p value (0.0753462) is greater than 5 per cent but less than 10 per cent. In the second test (left-tail test), the null hypothesis is rejected for a significance level of 5 per cent, while it would not have been rejected for a significance level of 1 per cent. The p value (0.0376731) is less than 5 per cent but greater than 1 per cent. Finally, in the last test (right-tail test), examining the p value (0.962327) suggests that there are good reasons for not rejecting the null hypothesis, as it is well above any acceptable significance level.

1.3. Comparing the difference of two means to a given value, when the variance of the two populations is known

Research question Is the difference between the two means μ1 and μ2 of two populations of known variances σ1² and σ2² significantly different from a given value D0 (for instance zero)?

Application conditions

  • The variances σ1² and σ2² of the two populations are known. The means μ1 and μ2 are unknown.
  • Both samples are random and contain n1 and n2 independent observations respectively.
  • Either the mean for each population follows a normal distribution, or the size of each sample is greater than 5.

Hypotheses

Null hypothesis, H0: μ1 – μ2 = D0,

the alternative hypothesis is H1: μ1 – μ2 ≠ D0 (for a two-tail test)

or H1: μ1 – μ2 < D0 (for a left-tail test)

or H1: μ1 – μ2 > D0 (for a right-tail test)

Statistic

The statistic calculated is

\[ Z = \frac{m_1 - m_2 - D_0}{\sigma_d} \]

where

\[ m_1 = \frac{1}{n_1} \sum_{i=1}^{n_1} x_{1i}, \qquad m_2 = \frac{1}{n_2} \sum_{i=1}^{n_2} x_{2i} \]

x1i = the value of variable X for observation i in population 1,
x2i = the value of variable X for observation i in population 2,

and

\[ \sigma_d = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]

is the standard deviation of the difference (m1 – m2).

Z follows a standard normal distribution.

Interpreting the test The decision rules are the following:

  • In a two-tail test, H0 will be rejected if Z < –Zα/2 or Z > Zα/2
  • in a left-tail test, H0 will be rejected if Z < –Zα
  • in a right-tail test, H0 will be rejected if Z > Zα.
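A minimal sketch of this two-sample statistic is given below. It uses the standard form Z = (m1 – m2 – D0) divided by the standard deviation of the difference; the function name two_sample_z and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(m1, m2, var1, var2, n1, n2, d0=0.0):
    """Z statistic for H0: mu1 - mu2 = d0 when both population variances are known."""
    sd_diff = sqrt(var1 / n1 + var2 / n2)  # standard deviation of (m1 - m2)
    return (m1 - m2 - d0) / sd_diff


# Hypothetical figures: means 52 and 50, known variances 16 and 9
z_stat = two_sample_z(m1=52, m2=50, var1=16, var2=9, n1=64, n2=36)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tail test at the 5 per cent level
print(round(z_stat, 3), abs(z_stat) > z_crit)
```

With these hypothetical figures the absolute value of Z exceeds the two-tail critical value, so H0 would be rejected at the 5 per cent level.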

1.4. Comparing the difference of two means to a given value, when the two populations have the same variance, but its exact value is unknown

Research question Is the difference between the two means μ1 and μ2 of two populations that have the same unknown variance σ² significantly different from a given value D0 (for instance zero)?

Application conditions

  • The two populations have the same unknown variance (σ²). The two means μ1 and μ2 are not known.
  • Both samples are random and contain n1 and n2 independent observations respectively.
  • Either the mean for each population follows a normal distribution, or the size of each sample is greater than 30.
  • The hypothesis of equality of the variances is verified (see Section 3.2 of this chapter).

Hypotheses

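Null hypothesis, H0: μ1 – μ2 = D0,

the alternative hypothesis is H1: μ1 – μ2 ≠ D0 (for a two-tail test)

or H1: μ1 – μ2 < D0 (for a left-tail test)

or H1: μ1 – μ2 > D0 (for a right-tail test)

Statistic

The statistic calculated is:

\[ T = \frac{m_1 - m_2 - D_0}{s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s^2 = \frac{\sum_{i=1}^{n_1} (x_{1i} - m_1)^2 + \sum_{i=1}^{n_2} (x_{2i} - m_2)^2}{n_1 + n_2 - 2} \]

where m1 and m2 are the means of the two samples, s² is the estimate of the common variance, and

x1i = the value of the observed variable X for observation i in population 1,

x2i = the value of the observed variable X for observation i in population 2,

This statistic follows Student’s t distribution with n1 + n2 – 2 degrees of freedom.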

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if T < –Tα/2; n1 + n2 – 2 or T > Tα/2; n1 + n2 – 2
  • in a left-tail test, H0 will be rejected if T < –Tα; n1 + n2 – 2
  • in a right-tail test, H0 will be rejected if T > Tα; n1 + n2 – 2.

When the sample size is sufficiently large (that is, n1 > 30 and n2 > 30), the distribution of the statistic T approximates a normal distribution, in which case

\[ T \approx Z, \quad Z \sim N(0, 1) \]

The decision to reject or not to reject the null hypothesis can then be made by comparing the calculated statistic T to the normal distribution. In this case, the following decision rules should apply:

  • In a two-tail test, H0 will be rejected if T < –Zα/2 or T > Zα/2
  • in a left-tail test, H0 will be rejected if T < –Zα
  • in a right-tail test, H0 will be rejected if T > Zα.
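The pooled-variance statistic of this section can be sketched with the standard library alone. The function name pooled_t and the two small samples are hypothetical; the code follows the usual textbook definition, in which the common variance is estimated from both samples together.

```python
from math import sqrt
from statistics import mean


def pooled_t(sample1, sample2, d0=0.0):
    """Return (T, degrees of freedom) for H0: mu1 - mu2 = d0, equal variances assumed."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    # Sums of squared deviations around each sample mean
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    s2 = (ss1 + ss2) / (n1 + n2 - 2)  # pooled estimate of the common variance
    t_stat = (m1 - m2 - d0) / sqrt(s2 * (1 / n1 + 1 / n2))
    return t_stat, n1 + n2 - 2


# Hypothetical samples
t_stat, df = pooled_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
print(round(t_stat, 3), df)
```

The resulting T is then compared with the Student value for n1 + n2 – 2 degrees of freedom, taken from tables or from a statistics package, or with the normal values when both samples are large.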

1.5. Comparing two means, when the variances of the populations are unknown and differ

Research question Do the two means μ1 and μ2 of two populations of unknown variances σ1² and σ2² differ significantly from each other?

Application conditions

  • The variances σ1² and σ2² of the two populations are unknown and different. The means μ1 and μ2 are unknown.
  • Both samples are random and contain n1 and n2 independent observations respectively.
  • For both populations, the mean follows a normal distribution.
  • The two samples are of practically the same size.
  • At least one of the samples contains fewer than 20 elements.
  • The hypothesis of the inequality of variances is fulfilled (see Section 3.2 of this chapter).

Hypotheses

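Null hypothesis, H0: μ1 = μ2,

the alternative hypothesis is H1: μ1 ≠ μ2 (for a two-tail test)

or H1: μ1 < μ2 (for a left-tail test)

or H1: μ1 > μ2 (for a right-tail test)

Statistic Using the same notations as in Section 1.4 of this chapter, the statistic calculated is:

\[ T' = \frac{m_1 - m_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad s_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_{1i} - m_1)^2, \qquad s_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} (x_{2i} - m_2)^2 \]

This statistic T’ is called an Aspin-Welch test. It approximates a Student T distribution in which the number of degrees of freedom ν is the closest integer value to the result of the following formula:

\[ \nu = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} \]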

Interpreting the test The decision rules then become the following:

  • in a two-tail test, H0 will be rejected if T’ < –Tα/2; ν or T’ > Tα/2; ν
  • in a left-tail test, H0 will be rejected if T’ < –Tα; ν
  • in a right-tail test, H0 will be rejected if T’ > Tα; ν.
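The Aspin-Welch statistic and its degrees of freedom can be sketched as follows. The sketch assumes the usual Welch-Satterthwaite expression for the degrees of freedom, rounded to the closest integer as described above; the function name welch_t and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (T', nu) for H0: mu1 = mu2 when the population variances differ."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the sample variance with n - 1 in the denominator
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t_stat = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, round(nu)


# Hypothetical samples with visibly different spreads
t_stat, nu = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(round(t_stat, 3), nu)
```

The value of T’ is then compared with the Student value for ν degrees of freedom, as in the decision rules above.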

1.6. Comparing k means (μk): analysis of variance

Research question Do k means, m1, m2… , mk observed on k samples differ significantly from each other?

This question is answered through an analysis of variance (Anova).

Application conditions

  • All k samples are random and contain n1, n2… nk independent observations.
  • For all k populations, the mean approximates a normal distribution with the same, unknown, variance (σ²).

Hypotheses

Null hypothesis, H0: μ1 = μ2 = … = μk,

alternative hypothesis, H1: the values of μi (i = 1, 2, … , k) are not all identical. This means that one single different value would be enough to reject the null hypothesis, thus validating the alternative hypothesis.

Statistic The statistic calculated is

\[ F = \frac{\sum_{i=1}^{k} n_i (m_i - m)^2 / (k - 1)}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - m_i)^2 / (n - k)} \]

where xij is the value of observation j in sample i, mi is the mean of sample i, and m is the mean of all n observations.

The statistic F follows a Fisher distribution with k – 1 and n – k degrees of freedom, where n is the total number of observations.

Interpreting the test The decision rule is the following:

H0 will be rejected if F > Fα; k – 1; n – k.

The analysis of variance (Anova) can be generalized to the comparison of the mean profiles of k groups according to j variables Xj. This analysis is called a Manova, which stands for ‘multivariate analysis of variance’. Like the Anova, the test used is Fisher’s F test, and the decision rules are the same.
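For the one-way case, the F ratio can be sketched directly from its definition as the between-group variance divided by the within-group variance. The function name anova_f and the three small groups are hypothetical.

```python
from statistics import mean


def anova_f(groups):
    """Return (F, (k - 1, n - k)) for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group variance: spread of the group means around the grand mean
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    # Within-group variance: spread of the observations around their group mean
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return between / within, (k - 1, n - k)


# Hypothetical data for k = 3 groups of 3 observations each
f_stat, df = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(round(f_stat, 2), df)
```

The F value is then compared with the tabulated Fisher value for k – 1 and n – k degrees of freedom at the chosen significance level.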

1.7. Comparing k means (μk): multiple comparisons

Research question Of k means m1, m2… , mk observed on k samples, which if any differ significantly?

The least significant difference test, or LSD, is used in the context of an Anova when examination of the ratio F leads to the rejection of the null hypothesis H0 of the equality of means, and when more than two groups are present. In this situation, a classic analysis of variance will only furnish global information, without indicating which means differ. LSD tests, such as the Scheffé, Tukey, or Duncan tests, compare the groups two by two. These tests are all included in the major statistical analysis computer programs.

Application conditions

  • All k samples are random and contain n1, n2… nk independent observations.
  • For all k populations, the mean approximates a normal distribution with the same, unknown, variance (σ²).

Hypotheses

Null hypothesis, H0: π1 = π2 = … = πk,

alternative hypothesis, H1: the values of πi (i = 1, 2, … , k) are not all identical. This means that one single different value would be enough to reject the null hypothesis, thus validating the alternative hypothesis.

Statistic

The statistic calculated is

$$T_{ij} = \frac{\bar{Y}_{i.} - \bar{Y}_{j.}}{\sqrt{S^2 \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}}$$

where $\bar{Y}_{i.}$ is the mean of group i, $\bar{Y}_{j.}$ the mean of group j, $n_i$ the number of observations of group i, $n_j$ the number of observations of group j, and $S^2$ the estimation of the variance within each group. The statistic $T_{ij}$ follows a Student distribution with n – k degrees of freedom, where n is the total number of observations. This signifies that the means of all the two-by-two combinations of the k groups will be compared.

Interpreting the test The decision rule is the following:

H0 will be rejected if at least one $T_{ij}$ is greater in absolute value than $T_{\alpha/2;\,n-k}$. When $|T_{ij}| > T_{\alpha/2;\,n-k}$, the difference between the means $\bar{Y}_{i.}$ and $\bar{Y}_{j.}$ of the two groups i and j in question is judged to be significant.
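The two-by-two comparison can be sketched as follows. This is a hedged illustration that assumes the usual LSD form, in which the difference between two group means is divided by the square root of S²(1/ni + 1/nj), with S² the pooled within-group variance; the function name and the data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def lsd_t_statistics(groups):
    """Return {(i, j): T_ij} for every pair of groups (0-based indices)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    # S^2: within-group variance, estimated with n - k degrees of freedom
    s2 = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (n - k)
    return {
        (i, j): (means[i] - means[j])
        / sqrt(s2 * (1 / len(groups[i]) + 1 / len(groups[j])))
        for i, j in combinations(range(k), 2)
    }


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up data for three groups
for pair, t in lsd_t_statistics(groups).items():
    print(pair, round(t, 3))
```

Each value is compared, in absolute value, with the Student critical value for n – k degrees of freedom.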

1.8. Comparing k means (μk): analysis of covariance

Research question Do k means m1, m2… , mk observed on k samples differ sig­nificantly from each other?

Analysis of covariance permits the differences between the means of different groups to be tested, taking into account the influence of one or more metric variables Xj, called concomitants. This essentially involves carrying out a linear regression to explain the means in terms of the concomitant variables Xj, and then using an analysis of variance to examine any differences between groups that are not explained by the linear regression. Analysis of covariance is therefore a method of comparing two or more means. Naturally, if the regression coefficients associated with the explanatory concomitant metric variables are not significant, an analysis of variance will have to be used instead.

Application conditions

  • All k samples are random and contain n1, n2… nk independent observations.
  • For all k populations, the mean approximates a normal distribution with the same, unknown, variance (σ²).
  • The choice of structure of the k groups should not determine the values of the concomitant metric variables.

Hypotheses

Null hypothesis, H0: μ1 = μ2 = … = μk,

alternative hypothesis, H1: the values of μi (i = 1, 2, … , k) are not all identical. This means that one single different value would be enough to reject the null hypothesis, thus validating the alternative hypothesis.

Statistic

The statistic calculated is

$$F = \frac{\text{explained variance}}{\text{residual variance}}$$

where the explained variance is the estimation made from the sample of the variance between the groups, and the residual variance is the estimation of the variance of the residuals. This statistic F follows a Fisher distribution, with k – 1 and n – k – 1 degrees of freedom, n being the total number of observations.

Interpreting the test The decision rule is the following:

H0 will be rejected if $F > F_{k-1;\,n-k-1}$. The statistic F and the observed significance level are automatically calculated by statistical analysis software.

The analysis of covariance (Ancova) can be generalized to the comparison of the mean profiles of k groups according to j variables X. This is called a Mancova, for ‘multivariate analysis of covariance’. The test used (that is, Fisher’s F test) and the decision rules remain the same.

1.9. Comparing two series of measurements: the Hotelling T² test

Research question Do the mean profiles of two series of k measurements (m1, m2, … , mk) and (m’1, m’2, … , m’k), observed for two samples, differ significantly from each other?

The Hotelling T2 test is used to compare any two matrices or vectors, particu­larly correlation matrices, variance or covariance matrices, or mean-value vectors.

Application conditions

  • The two samples are random and contain n1 and n2 independent observa­tions respectively.
  • The different measurements are independent and present a normal, multi­variate distribution.

Hypotheses

Null hypothesis, H0: the two series of measurements present the same profile,

alternative hypothesis, H1: the two series of measurements present different profiles.

Statistic

The statistic calculated is

$$F = \frac{n_1 + n_2 - k - 1}{k (n_1 + n_2 - 2)} \, T^2$$

where $T^2$ is Hotelling’s T², k the number of variables, and $n_1$ and $n_2$ are the numbers of observations in the first and second sample.

The statistic F follows a Fisher distribution with k and $n_1 + n_2 - k - 1$ degrees of freedom.

Interpreting the test The decision rule is the following:

H0 will be rejected if $F > F_{k;\,n_1+n_2-k-1}$.

2. Proportion Tests

2.1. Comparing a proportion or a percentage to a reference value π0: binomial test

Research question Does the proportion p, calculated on a sample, differ signi­ficantly from a hypothetical proportion π0?

Application conditions

  • The sample is random and contains n independent observations.
  • The distribution of the proportion in the population is binomial.
  • The size of the sample is relatively large (greater than 30).

Hypotheses

Null hypothesis, H0: π = π0,

alternative hypothesis, H1: π ≠ π0 (for a two-tail test)

or H1: π < π0 (for a left-tail test)

or H1: π > π0 (for a right-tail test).

Statistic

The statistic calculated is

$$Z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}$$

It follows a standard normal distribution.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
  • in a left-tail test, H0 will be rejected if $Z < -Z_{\alpha}$
  • in a right-tail test, H0 will be rejected if $Z > Z_{\alpha}$.
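A short Python sketch of this test, assuming the usual statistic Z = (p – π0) / √(π0(1 – π0)/n). The function name and the figures are hypothetical; the two-tail observed significance level is obtained from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, pi0):
    """Return (Z, two-tail p-value) for x favourable cases out of n."""
    p = x / n
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    # Probability of a value at least as extreme as |Z| under H0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample: 60 favourable cases out of 100, reference value 0.5
print(proportion_z_test(60, 100, 0.5))
```

In a two-tail test at a given level α, H0 is rejected when the returned p-value is below α, which is equivalent to the comparison of Z with the critical values.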

2.2. Comparing two proportions or percentages p1 and p2 (with large samples)

Research question Do the two proportions or percentages p1 and p2, observed in two samples, differ significantly?

Application conditions

  • The two samples are random and contain n1 and n2 independent observa­tions respectively.
  • The distribution of the proportion in each population is binomial.
  • Both samples are large (n1 > 30 and n2 > 30).

Hypotheses

Null hypothesis, H0: π1 = π2,

alternative hypothesis, H1: π1 ≠ π2 (for a two-tail test)

or H1: π1 < π2 (for a left-tail test)

or H1: π1 > π2 (for a right-tail test).

Statistic

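The statistic calculated is

$$Z = \frac{p_1 - p_2}{\sqrt{p_0 (1 - p_0) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \quad \text{with} \quad p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

It follows a standard normal distribution.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
  • in a left-tail test, H0 will be rejected if $Z < -Z_{\alpha}$
  • in a right-tail test, H0 will be rejected if $Z > Z_{\alpha}$.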
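The comparison of two proportions can be sketched in the same way. The version below assumes the pooled form of the statistic, in which the common proportion p0 is estimated from both samples together; the function name and the counts are hypothetical.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Return Z for x1 favourable cases out of n1 versus x2 out of n2."""
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    return (p1 - p2) / sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))


# Hypothetical samples: 60 out of 100 versus 45 out of 100
print(round(two_proportion_z(60, 100, 45, 100), 3))
```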

2.3. Comparing k proportions or percentages pk (large samples)

Research question Do the k proportions or percentages p1, p2… , pk, observed in k samples, differ significantly from each other?

Application conditions

  • The k samples are random and contain n1, n2, … , nk independent observations.
  • The distribution of the proportion or percentage is binomial in each of the k populations.
  • All the samples are large (n1, n2, … and nk > 50).
  • The k proportions pk as well as their complements 1 – pk represent at least five observations, that is: pk × nk ≥ 5 and (1 – pk) × nk ≥ 5.

Hypotheses

Null hypothesis, H0: π1 = π2 = … = πk,

alternative hypothesis, H1: the values of πi (i = 1, 2, … , k) are not all identical. This means that one single different value would be enough to reject the null hypothesis, thus validating the alternative hypothesis.

Statistic

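The statistic calculated is

$$\chi^2 = \sum_{j=1}^{k} \frac{(x_j - n_j p)^2}{n_j \, p (1 - p)}$$

where $x_j$ = the number of observations in the sample j corresponding to the proportion $p_j$, and

$$p = \frac{\sum_{j=1}^{k} x_j}{\sum_{j=1}^{k} n_j}$$

The statistic $\chi^2$ follows a chi-square distribution, with k – 1 degrees of freedom.

Interpreting the test The decision rule is the following:

H0 will be rejected if $\chi^2 > \chi^2_{\alpha;\,k-1}$.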
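For illustration, the chi-square statistic for k proportions can be computed as below. This sketch assumes the usual form, in which each count xj is compared with nj × p, p being the overall proportion estimated from all k samples; the function name and the counts are hypothetical.

```python
def k_proportions_chi2(successes, sizes):
    """Return (chi-square, k - 1) for k samples given counts and sample sizes."""
    p = sum(successes) / sum(sizes)  # overall proportion under H0
    chi2 = sum(
        (x - n * p) ** 2 / (n * p * (1 - p)) for x, n in zip(successes, sizes)
    )
    return chi2, len(sizes) - 1


# Hypothetical counts: 30, 40 and 50 favourable cases in three samples of 100
print(k_proportions_chi2([30, 40, 50], [100, 100, 100]))
```

The statistic is compared with the chi-square critical value for k – 1 degrees of freedom read from a table.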

3. Variance Test

3.1. Comparing the variance σ² to a reference value σ0²

Research question Does the variance s² calculated from a sample differ significantly from a hypothetical variance σ0²?

Application conditions

  • The sample is random and contains n independent observations.
  • The variable is normally distributed in the population; its mean and standard deviation are not known.

Hypotheses

Null hypothesis, H0: σ² = σ0²,

alternative hypothesis, H1: σ² ≠ σ0² (for a two-tail test)

or H1: σ² < σ0² (for a left-tail test)

or H1: σ² > σ0² (for a right-tail test).

Statistic

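The statistic calculated is

$$\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} = \frac{\sum_{i=1}^{n} (x_i - m)^2}{\sigma_0^2}$$

where $\sigma_0^2$ is the given variance, $s^2$ the variance estimated from the sample, $x_i$ the value of observation i, and m the mean estimated from the sample. The statistic $\chi^2$ follows a chi-square distribution with n – 1 degrees of freedom, which is written $\chi^2(n - 1)$.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $\chi^2 > \chi^2_{\alpha/2;\,n-1}$ or $\chi^2 < \chi^2_{1-\alpha/2;\,n-1}$
  • in a left-tail test, H0 will be rejected if $\chi^2 < \chi^2_{1-\alpha;\,n-1}$
  • in a right-tail test, H0 will be rejected if $\chi^2 > \chi^2_{\alpha;\,n-1}$.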

3.2. Comparing two variances

Research question Do the variances σ1² and σ2² of two populations differ significantly from each other?

Application conditions

  • Both samples are random and contain n1 and n2 independent observations respectively.
  • The distribution of the variance of each population is normal, or the samples are large (n1 > 30 and n2 > 30).

Hypotheses

Null hypothesis, H0: σ1² = σ2²,

alternative hypothesis, H1: σ1² ≠ σ2² (for a two-tail test)

or H1: σ1² < σ2² (for a left-tail test)

or H1: σ1² > σ2² (for a right-tail test).

Statistic

The statistic calculated is

$$F = \frac{s_1^2}{s_2^2}$$

where

$$s_1^2 = \frac{\sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2}{n_1 - 1}$$

and

$$s_2^2 = \frac{\sum_{i=1}^{n_2} (x_{2i} - \bar{x}_2)^2}{n_2 - 1}$$

$x_{1i}$ = the value of variable X for observation i in population 1,

$x_{2i}$ = the value of variable X for observation i in population 2,

$\bar{x}_1$ = the mean estimated from the sample of variable X in population 1,

$\bar{x}_2$ = the mean estimated from the sample of variable X in population 2.

If required, the numbering of the samples may be inverted to give the numerator the greater of the two estimated variances.

The statistic F follows a Fisher-Snedecor distribution, F(n1 – 1, n2 – 1).

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $F > F_{\alpha/2;\,n_1-1,\,n_2-1}$ or $F < F_{1-\alpha/2;\,n_1-1,\,n_2-1}$
  • in a left-tail test, H0 will be rejected if $s_2^2 / s_1^2 > F_{\alpha;\,n_2-1,\,n_1-1}$
  • in a right-tail test, H0 will be rejected if $F > F_{\alpha;\,n_1-1,\,n_2-1}$.
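The ratio of the two estimated variances, with the larger one placed in the numerator as suggested above, can be sketched as follows. The function name and the data are hypothetical; `statistics.variance` uses the n – 1 denominator.

```python
from statistics import variance


def variance_ratio_f(sample_1, sample_2):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    s2_1, s2_2 = variance(sample_1), variance(sample_2)
    df_1, df_2 = len(sample_1) - 1, len(sample_2) - 1
    if s2_1 < s2_2:  # invert the numbering so that F is at least 1
        s2_1, s2_2, df_1, df_2 = s2_2, s2_1, df_2, df_1
    return s2_1 / s2_2, df_1, df_2


# Made-up samples; the second one is twice as spread out as the first
print(variance_ratio_f([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Because the samples may have been swapped, the degrees of freedom are returned with the statistic so that the right column of the Fisher-Snedecor table is consulted.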

3.3. Comparing k variances: Bartlett test

Research question Do k variances σ21 , σ22 , ….σ2k observed in k samples, differ significantly from each other?

Application conditions

  • All k samples are random and contain n1, n2… nk independent observations.
  • The distribution of the variance of all k populations is normal.
  • None of the observed variances equals zero.

Hypotheses

Null hypothesis, H0: σ1² = σ2² = … = σk²,

alternative hypothesis, H1: the values of σi² (i = 1, 2, … , k) are not identical.

Statistic

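The statistic calculated is

$$\chi^2 = \frac{(n - k) \ln s^2 - \sum_{i=1}^{k} (n_i - 1) \ln s_i^2}{1 + \frac{1}{3 (k - 1)} \left( \sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{n - k} \right)}$$

where n = n1 + n2 + … + nk, ν = k – 1, $s^2 = \sum_{i=1}^{k} (n_i - 1) s_i^2 / (n - k)$ is the pooled estimate of the variance, and

$x_{ij}$ = the value of variable X for observation j in population i,

$\bar{x}_i$ = the mean of variable X in population i, estimated from a sample of size $n_i$,

$s_i^2$ = the variance of variable X in population i, estimated from a sample of size $n_i$.

The statistic $\chi^2$ follows a chi-square distribution with ν degrees of freedom.

Interpreting the test The decision rule is the following:

H0 will be rejected if $\chi^2 > \chi^2_{\alpha;\,\nu}$.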

3.4. Comparing k variances: Cochran test

Research question Do k variances σ21 , σ22 , ….σ2k observed in k samples, differ significantly from each other?

More precisely, the Cochran test examines whether the greatest of the k variances is significantly different from the k – 1 other variances.

Application conditions

  • The k samples are random and contain the same number n of independent observations.
  • The variance in each of the k populations follows a normal distribution, or at least a uni-modal distribution.

Hypotheses

Null hypothesis, H0: σ1² = σ2² = … = σk²,

alternative hypothesis, H1: the values of σ2i (i = 1, 2, … , k) are not identical.

Statistic

The statistic calculated is

$$C = \frac{S^2_{\max}}{\sum_{i=1}^{k} s_i^2}$$

where the values of $s_i^2$ are the estimated variances calculated with ν = n – 1 degrees of freedom, and $S^2_{\max}$ is the greatest estimated variance within the k samples.

The statistic C is compared to the critical value Ca, as read from a table.

Interpreting the test The decision rule is the following: H0 will be rejected if C > Ca.
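As a hedged illustration, Cochran's C can be computed as the greatest estimated variance divided by the sum of the k estimated variances. The function name and the data below are hypothetical, and the check on equal sample sizes mirrors the application condition stated above.

```python
from statistics import variance


def cochran_c(samples):
    """Return Cochran's C for k samples of identical size."""
    if len({len(s) for s in samples}) != 1:
        raise ValueError("the Cochran test assumes samples of the same size")
    variances = [variance(s) for s in samples]  # each with n - 1 degrees of freedom
    return max(variances) / sum(variances)


# Made-up data: three samples of three observations each
print(cochran_c([[1, 2, 3], [2, 4, 6], [1, 1, 4]]))
```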

4. Correlation Tests

4.1. Comparing a linear correlation coefficient r to zero

Research question Is the linear correlation coefficient r of two variables X and Y significant – that is, not equal to zero?

Application conditions

  • The observed variables X and Y are, at least, continuous variables.

Hypotheses

Null hypothesis, H0: ρ = 0,

alternative hypothesis, H1: ρ ≠ 0 (for a two-tail test)

or H1: ρ < 0 (for a left-tail test)

or H1: ρ > 0 (for a right-tail test).

Statistic

The statistic calculated is

$$T = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$

It follows a Student distribution with n – 2 degrees of freedom.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $T < -T_{\alpha/2;\,n-2}$ or $T > T_{\alpha/2;\,n-2}$
  • in a left-tail test, H0 will be rejected if $T < -T_{\alpha;\,n-2}$
  • in a right-tail test, H0 will be rejected if $T > T_{\alpha;\,n-2}$.

For large values of n (n – 2 > 30), this statistic approximates a standard nor­mal distribution. The decision to reject or not to reject the null hypothesis can then be made by comparing the calculated statistic T to values of the standard normal distribution, using the decision rules that have already been presented earlier in this chapter.
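The calculation can be sketched in Python as follows, assuming the usual form T = r√(n – 2) / √(1 – r²), where r is the sample correlation coefficient. The function name and the data are hypothetical.

```python
from math import sqrt


def correlation_t(x, y):
    """Return (r, T, n - 2) for two paired series of observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    r = s_xy / sqrt(s_xx * s_yy)  # linear correlation coefficient
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    return r, t, n - 2


# Made-up paired observations
print(correlation_t([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

The function is undefined when r equals exactly 1 or –1, since the denominator is then zero.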

4.2. Comparing a linear correlation coefficient r to a reference value ρ0

Research question Is the linear correlation coefficient r of two variables X and Y, calculated from a sample, significantly different from a hypothetical reference value ρ0?

Application conditions

  • The observed variables X and Y are, at least, continuous variables.

Hypotheses

Null hypothesis, H0: ρ = ρ0,

alternative hypothesis, H1: ρ ≠ ρ0 (for a two-tail test)

or H1: ρ < ρ0 (for a left-tail test)

or H1: ρ > ρ0 (for a right-tail test).

Statistic

The statistic calculated is

$$Z = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \times \frac{1 - \rho_0}{1 + \rho_0} \right) \sqrt{n - 3}$$

It follows a standard normal distribution.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
  • in a left-tail test, H0 will be rejected if $Z < -Z_{\alpha}$
  • in a right-tail test, H0 will be rejected if $Z > Z_{\alpha}$.
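A minimal sketch of this comparison, assuming the statistic is built on Fisher's transformation ½ ln((1 + r)/(1 – r)), which is the inverse hyperbolic tangent of r, with a standard error of 1/√(n – 3). The function name and the figures are hypothetical.

```python
from math import atanh, sqrt


def correlation_vs_reference_z(r, n, rho0):
    """Return Z for a sample correlation r (n pairs) against the reference rho0."""
    # atanh(r) equals 0.5 * ln((1 + r) / (1 - r)), i.e. Fisher's transformation
    return (atanh(r) - atanh(rho0)) * sqrt(n - 3)


# Hypothetical case: r = 0.8 observed on 28 pairs, reference value 0.5
print(round(correlation_vs_reference_z(0.8, 28, 0.5), 4))
```

The same transformation can be applied to two sample coefficients r1 and r2, in which case the difference is divided by √(1/(n1 – 3) + 1/(n2 – 3)).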

4.3. Comparing two linear correlation coefficients, ρ1 and ρ2

Research question Do the two linear correlation coefficients ρ1 and ρ2 differ significantly from each other?

Application conditions

  • The two linear correlation coefficients r1 and r2 are obtained from two samples of size n1 and n2 respectively.

Hypotheses

Null hypothesis, H0: ρ1 = ρ2,

alternative hypothesis, H1: ρ1 ≠ ρ2 (for a two-tail test)

or H1: ρ1 < ρ2 (for a left-tail test)

or H1: ρ1 > ρ2 (for a right-tail test).

Statistic

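The statistic calculated is

$$Z = \frac{\frac{1}{2} \ln \frac{1 + r_1}{1 - r_1} - \frac{1}{2} \ln \frac{1 + r_2}{1 - r_2}}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}$$

It follows a standard normal distribution.

Interpreting the test The decision rules are thus the following:

  • in a two-tail test, H0 will be rejected if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
  • in a left-tail test, H0 will be rejected if $Z < -Z_{\alpha}$
  • in a right-tail test, H0 will be rejected if $Z > Z_{\alpha}$.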

5. Regression Coefficient Tests

5.1. Comparing a linear regression coefficient β to zero

Research question Is the linear regression coefficient β of two variables X and Y significant, that is, not equal to zero?

Application conditions

  • The observed variables X and Y are, at least, continuous variables.
  • β follows a normal distribution or the size n of the sample is greater than 30.

Hypotheses

Null hypothesis, H0: β = 0,

alternative hypothesis, H1: β ≠ 0 (for a two-tail test)

or H1: β < 0 (for a left-tail test)

or H1: β > 0 (for a right-tail test).

Statistic

The statistic calculated is

$$T = \frac{b}{s_b}$$

where b represents the regression coefficient β and $s_b$ its standard deviation, both estimated from the sample. The statistic T follows a Student distribution with n – 2 degrees of freedom.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $T < -T_{\alpha/2;\,n-2}$ or $T > T_{\alpha/2;\,n-2}$
  • in a left-tail test, H0 will be rejected if $T < -T_{\alpha;\,n-2}$
  • in a right-tail test, H0 will be rejected if $T > T_{\alpha;\,n-2}$.

5.2. Comparing a linear regression coefficient β to a reference value β0

Research question Is the linear regression coefficient β of two variables X and Y significantly different from a reference value β0?

Application conditions

  • The observed variables X and Y are, at least, continuous variables.
  • β follows a normal distribution or the size n of the sample is greater than 30.

Hypotheses

Null hypothesis, H0: β = β0,

alternative hypothesis, H1: β ≠ β0 (for a two-tail test)

or H1: β < β0 (for a left-tail test)

or H1: β > β0 (for a right-tail test).

Statistic

The statistic calculated is

$$T = \frac{b - \beta_0}{s_b}$$

where b represents the regression coefficient β and $s_b$ its standard deviation, both estimated from the sample. The statistic T follows a Student distribution with n – 2 degrees of freedom.

Interpreting the test The decision rules are the following:

  • in a two-tail test, H0 will be rejected if $T < -T_{\alpha/2;\,n-2}$ or $T > T_{\alpha/2;\,n-2}$
  • in a left-tail test, H0 will be rejected if $T < -T_{\alpha;\,n-2}$
  • in a right-tail test, H0 will be rejected if $T > T_{\alpha;\,n-2}$.

5.3. Comparing two linear regression coefficients β and β’ in two populations

Research question Do the two linear regression coefficients β and β’, observed in two populations, differ significantly from each other?

This is again a situation in which the difference between two estimated values, b and b’, with estimated variances $s_b^2$ and $s_{b'}^2$, is tested in the same way as a difference between two means. Naturally we must distinguish between cases in which the two variances are equal and cases in which they are not equal. If these variances differ, an Aspin-Welch test will be used.

Application conditions

  • β and β’ represent the values of the regression coefficient in two popula­tions, from which two independent, random samples have been selected.
  • The observed variables X and Y are, at least, continuous variables.

Hypotheses

Null hypothesis, H0: β = β’,

alternative hypothesis, H1 : β # β’ (for a two-tail test)

or H1 : β < β’ (for a left-tail test)

or H1 : β > β’  (for a right-tail test).

Statistic

The statistics calculated and the interpretations of the tests are the same as for the tests on differences of means, described in parts 1.1 through 1.5 in this chapter.

In addition, it is possible to use the same kind of tests on the constant (β0) of the linear regression equation. However, this practice is rarely used due to the great difficulties in interpreting the results. Similarly, more than two regression coefficients may be compared. For instance, the Chow test (Chow, 1960; Toyoda, 1974), which uses the Fisher-Snedecor distribution, is used to determine whether the coefficients of a regression equation are the same in two or more groups. This is called an omnibus test, which means that it tests whether the full set of equation coefficients is identical across groups.

When comparing two groups, a neat and quite simple alternative to the Chow test is the introduction of a ‘dummy variable’ to the regression, indicating the group each observation belongs to. The original variables are multiplied by the dummy variable, thus obtaining a set of new variables. The coefficient of the dummy variable represents the difference between the constants (β0) of the two groups, and the coefficients of the new variables represent the differences between the coefficients of the explanatory variables for the two groups. These coefficients can then be tested globally (as the Chow test does) or individually (see Sections 5.1 through 5.3 in this chapter) to identify which coefficient behaves differently according to the group.


Practicing Non-Parametric Tests in the Research

Non-parametric tests use statistics (that is, calculations) that have been estab­lished from observations, and do not depend on the distribution of the cor­responding population. The validity of non-parametric tests depends on very general conditions that are much less restrictive than those that apply to para­metric tests.

Non-parametric tests present a number of advantages, in that they are applicable to:

  • small samples
  • various types of data (nominal, ordinal, intervals, ratios)
  • incomplete or imprecise data.

1. Testing One Variable in Several Samples

1.1. Comparing a distribution to a theoretical distribution: goodness-of-fit test

Research question Does the empirical distribution De, observed in a sample, differ significantly from a reference distribution Dr?

Application conditions

  • The sample is random and contains n independent observations arranged into k classes.
  • A reference distribution, Dr, has been chosen (normal distribution, chi-square, etc.).

Hypotheses

Null hypothesis, H0: De = Dr,

alternative hypothesis, H1: De ≠ Dr.

Statistic

The statistic calculated is

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - T_i)^2}{T_i}$$

where, for each of the k classes, $O_i$ is the number of observations made on the sample and $T_i$ is the theoretical number of observations calculated according to the reference distribution Dr.

The statistic $\chi^2$ follows a chi-square distribution with k – 1 – r degrees of freedom, where r is the number of parameters of the reference distribution that have been estimated with the aid of observations.

Interpreting the test The decision rule is the following:

H0 will be rejected if $\chi^2 > \chi^2_{\alpha;\,k-1-r}$.
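The goodness-of-fit statistic can be sketched as below, assuming the usual form in which, for each class, the squared gap between the observed and theoretical counts is divided by the theoretical count. The function name, its `estimated_params` argument (the r of the text) and the counts are hypothetical.

```python
def goodness_of_fit_chi2(observed, theoretical, estimated_params=0):
    """Return (chi-square, k - 1 - r) for k classes."""
    if len(observed) != len(theoretical):
        raise ValueError("both series must describe the same k classes")
    chi2 = sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))
    return chi2, len(observed) - 1 - estimated_params


# Hypothetical example: 120 throws of a die against a uniform reference
observed = [18, 22, 20, 25, 15, 20]
theoretical = [20] * 6
print(goodness_of_fit_chi2(observed, theoretical))
```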

1.2. Comparing the distribution of a variable X in two populations (Kolmogorov-Smirnov test)

Research question Is the variable X distributed identically in two populations, A and B?

The Kolmogorov-Smirnov test may also be used to compare an observed distribution to a theoretical one.

Application conditions

  • The two samples are random and contain nA and nB independent observations, taken from populations A and B respectively.
  • The variable X is an interval or a ratio variable of any distribution.
  • The classes are defined in the same way in both samples.

Hypotheses

Null hypothesis, H0: The distribution of variable X is identical in A and B,

alternative hypothesis, H1: The distribution of variable X is different in A and B.

Statistic

The statistic calculated is:

$$d = \max_x \left| F_A(x) - F_B(x) \right|$$

where FA(x) and FB(x) represent the cumulative frequencies of the classes in A and in B. This value is compared to the critical values d0 of the Kolmogorov-Smirnov table.

Interpreting the test The decision rule is the following: H0 will be rejected if d > d0.
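As an illustration, the statistic d can be obtained as the largest absolute gap between the two cumulative relative frequencies. The sketch below works on raw observations rather than on predefined classes, which is an assumption made for simplicity; the function name and the data are hypothetical.

```python
def ks_two_sample_d(sample_a, sample_b):
    """Return the maximum gap between the two empirical cumulative frequencies."""

    def cumulative_frequency(sample, x):
        # Share of the sample that is less than or equal to x
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(sample_a) | set(sample_b))
    return max(
        abs(cumulative_frequency(sample_a, x) - cumulative_frequency(sample_b, x))
        for x in points
    )


# Made-up samples from populations A and B
print(ks_two_sample_d([1, 2, 3, 4], [3, 4, 5, 6]))
```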

1.3. Comparing the distribution of a variable X in two populations: Mann-Whitney U test

Research question Is the distribution of the variable X identical in the two populations A and B?

Application conditions

  • The two samples are random and contain nA and nB independent observa­tions from populations A and B respectively (where nA > nB). If required, the notation of the samples may be inverted.
  • The variable X is ordinal.

Hypotheses

Null hypothesis, H0: The distribution of the variable X is identical in A and B,

alternative hypothesis, H1: The distribution of the variable X is different in A and B.

Statistic

(A1, A2, …, AnA) is a sample, of size nA, selected from population A, and (B1, B2, …, BnB) is a sample of size nB selected from population B. N = nA + nB observations are obtained, which are classed in ascending order regardless of the samples they are taken from. Each observation is then given a rank: 1 for the smallest value, up to N for the greatest.

The statistic calculated is:

$$U = \min \left( n_A n_B + \frac{n_A (n_A + 1)}{2} - R_A \; ; \; n_A n_B + \frac{n_B (n_B + 1)}{2} - R_B \right)$$

where RA is the sum of the ranks of the elements in A, and RB the sum of the ranks of the elements in B. The statistic U is compared to the critical values Uα of the Mann-Whitney table.

Interpreting the test The decision rule is the following: H0 will be rejected if U < Uα.

For large values of nA and nB (that is, > 12),

$$U' = \frac{U - n_A n_B / 2}{\sqrt{n_A n_B (n_A + n_B + 1) / 12}}$$

tends rapidly toward a standard normal distribution. U’ may then be used, in association with the decision rules of the standard normal distribution for rejecting or not rejecting the null hypothesis.
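The ranking procedure and the statistic U can be sketched as follows. This is a minimal illustration that assumes the usual definition of U as the smaller of the two quantities built from the rank sums RA and RB, and that gives tied observations their mean rank; the function names and the data are hypothetical.

```python
from math import sqrt


def mean_ranks(values):
    """Rank from 1 (smallest) upwards; tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def mann_whitney_u(sample_a, sample_b):
    """Return (U, U') where U' is the normal approximation for large samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = mean_ranks(list(sample_a) + list(sample_b))
    r_a, r_b = sum(ranks[:n_a]), sum(ranks[n_a:])
    u_a = n_a * n_b + n_a * (n_a + 1) / 2 - r_a
    u_b = n_a * n_b + n_b * (n_b + 1) / 2 - r_b
    u = min(u_a, u_b)
    u_prime = (u - n_a * n_b / 2) / sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return u, u_prime


# Made-up samples; they are far too small for U' to be trusted in practice
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```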

1.4. Comparing the distribution of a variable X in two populations: Wilcoxon test

Research question Is the distribution of the variable X identical in the two populations A and B?

Application conditions

  • The two samples are random and contain nA and nB independent observations from populations A and B respectively.
  • The variable X is at least ordinal.

Hypotheses

Null hypothesis, H0: The distribution of variable X is identical in A and B,

alternative hypothesis, H1: The distribution of variable X is different in A and B.

Statistic

(A1, A2, …, AnA) is a sample, of size nA, selected from population A, and (B1, B2, …, BnB) is a sample of size nB selected from population B. N = nA + nB observations are obtained, which are classed in ascending order regardless of the samples they are taken from. Each observation is then given a rank: 1 for the smallest value, up to N for the greatest.

The statistic calculated is:

$$T = \frac{R - n_A (N + 1) / 2}{\sqrt{n_A n_B (N + 1) / 12}}$$

where $R(A_i)$ is the rank of observation $A_i$, i = 1, 2, …, nA, and

$$R = \sum_{i=1}^{n_A} R(A_i)$$

is the sum of all the ranks of the observations from sample A.

The statistic R is compared to critical values Rα available in a table.

Interpreting the test The decision rule is the following:

H0 will be rejected if R < Rα.

If N is sufficiently large (that is, N > 12), the distribution of T approximates a standard normal distribution, and the corresponding decision rules apply, as described above. When a normal distribution is approximated, a mean rank can be associated with any equally placed observations, and the formula becomes:

$$T = \frac{R - n_A (N + 1) / 2}{\sqrt{\frac{n_A n_B}{12} \left( N + 1 - \frac{\sum_{i=1}^{g} t_i (t_i^2 - 1)}{N (N - 1)} \right)}}$$

where g is the number of groups of equally-placed ranks and $t_i$ is the size of group i.

1.5. Comparing variable distribution in two populations: testing homogenous series

Research question Is the distribution of the variable X identical in the two populations A and B?

Application conditions

  • The two samples are random and contain nA and nB independent observations from the populations A and B respectively.
  • The variable X is at least ordinal.

Hypotheses

Null hypothesis, H0: The distribution of the variable X is identical in A and B,

alternative hypothesis, H1: The distribution of the variable X is different in A and B.

Statistic

(A1, A2, … , AnA) is a sample, of size nA, selected from population A, and (B1, B2, …, BnB) is a sample of size nB selected from population B. N = nA + nB obser­vations are obtained, which are classed in ascending order regardless of the samples they are taken from. Each observation is then given a rank: 1 for the smallest value, up to N for the greatest.

The statistic calculated is:

R = the longest ‘homogenous series’ (that is, series of consecutive values belong­ing to one sample) found in the general series ranking the nA + nB observations. The statistic R is compared to critical values Ca available in a table.

Interpreting the test The decision rule is the following:

H0 will be rejected if R < Ca.

For large values of nA and nB (that is, > 20),

tends towards the standard normal distribution. R’ may then be used instead, in association with the decision rules of the standard normal distribution for rejecting or not rejecting the null hypothesis.

1.6. Comparing the distribution of a variable X in k populations: Kruskal-Wallis test or analysis of variance by ranking

Research question Is the distribution of the variable X identical in the k popu­lations A1, A2, … , Ak?

Application conditions

  • The k samples are random and contain n1, n2, … , nk independent observations from the populations A1, A2, … , Ak.
  • The variable X is at least ordinal.

Hypotheses

Null hypothesis, H0: The distribution of the variable X is identical in all k populations,

alternative hypothesis, H1: The distribution of the variable X is different in at least one of the k populations.

Statistic

(A11, A12, … , A1n1) is a sample of size n1 selected from population A1, (A21, A22, … , A2n2) is a sample of size n2 selected from population A2, and so on up to population Ak.

N = n1 + n2 + … + nk observations are obtained, which are classed in ascending order regardless of the samples they are taken from. Each observation is then given a rank: 1 for the smallest value, up to N for the greatest. For equal observations, a mean rank is attributed. Ri is the sum of the ranks attributed to observations of sample Ai.

The statistic calculated is:

$$H = \frac{12}{N (N + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3 (N + 1)$$

If a number of equal values are found, a corrected value is used:

$$H' = \frac{H}{1 - \frac{\sum_{i=1}^{g} (t_i^3 - t_i)}{N^3 - N}}$$

where g is the number of groups of equal values, and $t_i$ the size of group i.

Interpreting the test The decision rule is the following:

H0 will be rejected if H (or, where applicable, H’) > $\chi^2_{1-\alpha;\,k-1}$, or a corresponding value in the Kruskal-Wallis table.
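A hedged Python sketch of the Kruskal-Wallis statistic, assuming the usual form H = 12/(N(N + 1)) × Σ Ri²/ni – 3(N + 1) and the usual correction for equal values, which divides H by 1 – Σ(ti³ – ti)/(N³ – N). The function names and the data are hypothetical.

```python
def mean_ranks(values):
    """Rank from 1 (smallest) upwards; tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def kruskal_wallis_h(samples):
    """Return (H corrected for ties, k - 1) for a list of k samples."""
    pooled = [v for s in samples for v in s]
    n_total = len(pooled)
    ranks = mean_ranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        r_i = sum(ranks[start:start + len(s)])  # rank sum of this sample
        total += r_i ** 2 / len(s)
        start += len(s)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    # Without ties every group has size 1 and the correction equals 1
    tie_sizes = [pooled.count(v) for v in set(pooled)]
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction, len(samples) - 1


# Made-up data for three populations
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

The sketch does not handle the degenerate case in which all observations are equal, where the correction term is zero.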

If the Kruskal-Wallis test leads to the rejection of the null hypothesis, it is possible to identify which pairs of populations tend to be different. This requires the application of either the Wilcoxon signed-rank test or the sign test if the samples are matched (that is, logically linked, such that the pairs or n-tuples of observations in the different samples contain identical or similar individuals), or a Mann-Whitney U or Wilcoxon test if the samples are not matched. This is basically the same logic as used when associating the analysis of variance and the LSD test. The Kruskal-Wallis test is sometimes called analysis of variance by ranks.

1.7. Comparing two proportions or percentages (small samples)

Research question Do the two proportions or percentages p1 and p2, observed in two samples, differ significantly from each other?

Application conditions

  • The two samples are random and contain n1 and n2 independent observa­tions respectively.
  • The size of the samples is small (n1 < 30 and n2 < 30).
  • The two proportions p1 and p2 as well as their complements 1-p1 and 1-p2 represent at least 5 observations.

Hypotheses

Null hypothesis, H0: π1 = π2,

alternative hypothesis, H1: π1 ≠ π2.

Statistic

The statistic calculated is

$$\chi^2 = \frac{(x_1 - n_1 p)^2}{n_1 \, p (1 - p)} + \frac{(x_2 - n_2 p)^2}{n_2 \, p (1 - p)}$$

where $x_1$ is the number of observations in sample 1 (size $n_1$) corresponding to the proportion $p_1$, $x_2$ is the number of observations in sample 2 (size $n_2$) corresponding to the proportion $p_2$, and

$$p = \frac{x_1 + x_2}{n_1 + n_2}$$

This statistic $\chi^2$ follows a chi-square distribution with 1 degree of freedom.

Interpreting the test The decision rule is the following:

H0 will be rejected if $\chi^2 > \chi^2_{\alpha;\,1}$.

2. Tests on More than One Variable in One or Several Matched Samples

Two or more samples are called matched when they are linked in a logical man­ner, and the pairs or n-uplets of observations of different samples contain iden­tical or very similar individuals. For instance, samples comprised of the same people observed at different moments in time may comprise as many matched samples as observation points in time. Similarly, a sample of n individuals and another containing their n twins (or sisters, brothers, children, etc.) may be made up of matched samples in the context of a genetic study.

2.1. Comparing any two variables: independence or homogeneity test

Research question Are the two variables X and Y independent?

Application conditions

  • The sample is random and contains n independent observations.
  • The observed variables X and Y may be of any type (nominal, ordinal, interval, ratio) and are divided into kX and kY classes respectively.

Hypotheses

Null hypothesis, H0: X and Y are independent,

alternative hypothesis, H1: X and Y are dependent.

Statistic

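The statistic calculated is

$$\chi^2 = \sum_{i=1}^{k_X} \sum_{j=1}^{k_Y} \frac{\left( n_{ij} - \frac{n_{i.} \, n_{.j}}{n} \right)^2}{\frac{n_{i.} \, n_{.j}}{n}}$$

where $n_{ij}$ designates the number of observations presenting both characteristics $X_i$ and $Y_j$ (i varying from 1 to kX; j from 1 to kY),

$$n_{i.} = \sum_{j=1}^{k_Y} n_{ij}$$

is the number of observations presenting the characteristic $X_i$, and

$$n_{.j} = \sum_{i=1}^{k_X} n_{ij}$$

is the number of observations presenting the characteristic $Y_j$.

This statistic $\chi^2$ follows a chi-square distribution with (kX – 1) (kY – 1) degrees of freedom.

Interpreting the test The decision rule is the following:

H0 will be rejected if $\chi^2 > \chi^2_{\alpha;\,(k_X-1)(k_Y-1)}$.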
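The independence statistic can be sketched from a contingency table as follows. The sketch assumes the usual form, in which each observed count nij is compared with the count expected under independence (row total × column total / n); the function name and the table are hypothetical.

```python
def independence_chi2(table):
    """Return (chi-square, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count under independence
            chi2 += (observed - expected) ** 2 / expected
    return chi2, (len(row_totals) - 1) * (len(col_totals) - 1)


# Hypothetical 2 x 2 table crossing two characteristics of X with two of Y
print(independence_chi2([[10, 20], [20, 10]]))
```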

2.2. Comparing two variables X and Y measured from two matched samples A and B: sign test

Research question Are the two variables X and Y, measurable for two matched samples A and B, identically distributed?

Application conditions

  • The two samples are random and matched.
  • The n pairs of observations are independent.
  • The variables X and Y are at least ordinal.

Hypotheses

Null hypothesis, H0: The distribution of the two variables is identical in the two matched samples,

alternative hypothesis, H1: The distribution of the two variables is different in the two matched samples.

Statistic

The pairs (a1, b1), (a2, b2), … , (an, bn) are n pairs of observations, of which the first element is taken from population A and the second from population B. The difference ai – bi is calculated for each of these n pairs of observations (ai, bi). The number of positive differences is represented by k+, and the number of negative differences by k–. The statistic calculated is:

K = Minimum (k+, k-).

The statistic K is compared to critical values Ca available in a table.

Interpreting the Test The decision rule is the following:

H0 will be rejected if K < Cα.

For sufficiently large values of n (that is, n > 40),

\[ K' = \frac{2K - n + 1}{\sqrt{n}} \]

tends towards the standard normal distribution. K’ may then be used instead, in association with the decision rules of the standard normal distribution for rejecting or not rejecting the null hypothesis.
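
The sign test can be sketched in a few lines. The example below is an illustration under stated assumptions: pairs with a zero difference are discarded, the exact p-value uses the Binomial(n, 1/2) distribution of the number of positive differences, and the large-sample statistic is written with a continuity correction. The function name and the data are hypothetical.

```python
from math import comb, sqrt


def sign_test(a, b):
    """Return (K, n, exact two-sided p-value, normal approximation K')."""
    # Assumption: pairs with a zero difference are discarded before counting.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")
    k_plus = sum(1 for d in diffs if d > 0)
    k_minus = n - k_plus
    k = min(k_plus, k_minus)
    # Under H0 the number of positive differences follows Binomial(n, 1/2).
    p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)
    # Normal approximation with continuity correction (meant for large n).
    k_prime = (2 * k - n + 1) / sqrt(n)
    return k, n, p_value, k_prime


# Hypothetical scores for the same eight people in samples A and B.
a = [5, 7, 6, 8, 9, 4, 7, 6]
b = [3, 6, 6, 5, 7, 5, 4, 2]
k, n, p_value, k_prime = sign_test(a, b)
```

With so few pairs only the exact p-value is meaningful; K' is returned to show the formula, not because the approximation applies here.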

2.3. Comparing variables from matched samples: Wilcoxon sign test

Research question Are the two variables X and Y, measured from two matched samples A and B, identically distributed?

Application conditions

  • The two samples are random and matched.
  • The n pairs of observations are independent.
  • The variables X and Y are at least ordinal.

Hypotheses

Null hypothesis, H0: The distribution of the two variables is identical in the two matched samples,

alternative hypothesis, H1: The distribution of the two variables is different in the two matched samples.

Statistic

(a1, b1), (a2, b2), … , (an, bn) are n pairs of observations, of which the first element is taken from population A and the second from population B. For each pair of observations, the difference di = ai – bi is calculated. In this way n differences di are obtained, and these are sorted in ascending order of absolute value. They are ranked from 1, for the smallest absolute value, to n, for the greatest. Differences of equal absolute value are attributed a mean rank. Let R+ be the sum of the ranks of the positive differences and R– the sum of the ranks of the negative ones.

The statistic calculated is:

R = Minimum (R+, R–)

The statistic R is compared to critical values Rα available in a table.

Interpreting the test The decision rule is the following: H0 will be rejected if R < Rα.

For large values of n (that is, n > 20),

\[ R' = \frac{R - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \]

tends towards a standard normal distribution. R’ may then be used in association with the decision rules of the standard normal distribution for rejecting or not rejecting the null hypothesis.
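
A minimal sketch of the Wilcoxon statistic follows, assuming the usual signed-rank procedure: zero differences are dropped, the absolute differences are ranked with mean ranks for ties, and the ranks are summed separately for positive and negative differences. Function names and data are hypothetical.

```python
from math import sqrt


def mean_ranks(values):
    """Rank values in ascending order from 1, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_signed_rank(a, b):
    """Return (R+, R-, R, R') for two matched samples."""
    # Assumption: pairs with a zero difference are discarded.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")
    ranks = mean_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    r = min(r_plus, r_minus)
    # Normal approximation, intended for large n only.
    r_prime = (r - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return r_plus, r_minus, r, r_prime


# Hypothetical matched observations.
a = [5, 7, 6, 8, 9, 4, 7, 6]
b = [3, 6, 6, 5, 7, 5, 4, 2]
r_plus, r_minus, r, r_prime = wilcoxon_signed_rank(a, b)
```

A useful check is that R+ and R– always add up to n(n + 1)/2, here 28 for the seven non-zero differences.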

2.4. Comparing variables from matched samples: Kendall rank correlation test

Research question Are the two variables, X and Y, measured for two matched samples A and B, independent?

Application conditions

  • The two samples are random and matched.
  • The n pairs of observations are independent.
  • The variables X and Y are at least ordinal.

Hypotheses

Null hypothesis, H0: X and Y are independent, alternative hypothesis, H1: X and Y are dependent.

Statistic

The two variables (X, Y) observed in a sample of size n give n pairs of observations (X1, Y1), (X2, Y2), … , (Xn, Yn). An indication of the correlation between variables X and Y can be obtained by sorting the values of Xi in ascending order and counting the number of corresponding Yi values that do not respect this order. Sorting the values in ascending order guarantees that Xi < Xj for any value of i < j.

Let R be the number of pairs of observations (Xi, Yi) and (Xj, Yj) such that, for i < j, Xi < Xj and Yi < Yj.

The statistic calculated is:

\[ S = 2R - \frac{n(n-1)}{2} \]

The statistic S is then compared to critical values Sα available in a table.

Interpreting the test The decision rule is the following:

H0 will be rejected if S > Sα. In case of rejection of H0, the sign of S indicates the direction of the dependency.

For large values of n (that is, n > 15),

\[ S' = \frac{S}{\sqrt{\frac{n(n-1)(2n+5)}{18}}} \]

tends towards a standard normal distribution. S’ may then be used instead, in association with the decision rules of the standard normal distribution for rejecting or not rejecting the null hypothesis.
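
The counting procedure can be sketched as follows, assuming there are no tied values in X or Y. R counts the pairs that are in the same order on both variables, S is taken here as the number of such concordant pairs minus the number of discordant ones, and the large-sample statistic uses the standard variance of S without a continuity correction. Names and data are hypothetical.

```python
from math import sqrt


def kendall_s(x, y):
    """Return (R, S, S') for paired observations without ties."""
    n = len(x)
    pairs = sorted(zip(x, y))  # sort the pairs by ascending X
    # R: pairs (i, j), i < j, that are in ascending order on both X and Y.
    r = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if pairs[i][0] < pairs[j][0] and pairs[i][1] < pairs[j][1]
    )
    total_pairs = n * (n - 1) // 2
    s = 2 * r - total_pairs  # concordant minus discordant pairs
    s_prime = s / sqrt(n * (n - 1) * (2 * n + 5) / 18)
    return r, s, s_prime


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r, s, s_prime = kendall_s(x, y)
```

Dividing S by n(n – 1)/2 gives Kendall's tau, which is 0.6 for these data.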

2.5. Comparing two variables X and Y measured from two matched samples A and B: Spearman rank correlation test

Research question Are the two variables X and Y, measured for two matched samples A and B, independent?

Application conditions

  • The two samples are random and of the same size, n.
  • The observations in each of the samples are independent.
  • The variables X and Y are at least ordinal.

Hypotheses

Null hypothesis, H0: X and Y are independent, alternative hypothesis, H1: X and Y are dependent.

Statistic

The two variables (X, Y) observed in a sample of size n give n pairs of observations (X1, Y1), (X2, Y2), … , (Xn, Yn). The values of Xi and Yi can be classed separately in ascending order. Each of the values Xi and Yi is then attributed a rank between 1 and n. Let R(Xi) be the rank of the value Xi, R(Yi) the rank of the value Yi and di = R(Xi) – R(Yi).

The Spearman rank correlation coefficient is:

\[ R = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n^3 - n} \]

Interpreting the test The Spearman rank correlation coefficient R can be eval­uated in the same way as a classical correlation coefficient (see Sections 4.1 through 4.3 in main Section 2).
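
A minimal sketch of the Spearman coefficient, assuming no tied values (with ties, mean ranks would be needed and the shortcut formula below is only approximate). It ranks X and Y separately, takes the rank differences di, and applies R = 1 – 6Σdi²/(n³ – n). Function names and data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_r(x, y):
    """Spearman rank correlation from squared rank differences."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)


# Hypothetical paired observations.
x = [10, 20, 30, 40, 50]
y = [12, 11, 35, 30, 60]
r_s = spearman_r(x, y)
```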

2.6. Comparing classifications

Research question Are k classifications of n elements identical?

This type of test can be useful, for instance, when comparing classifications made by k different experts, or according to k different criteria or procedures.

Application conditions

  • A set of n elements E1, E2, …, En has been classified by k different procedures.

Hypotheses

Null hypothesis, H0: the k classifications are identical, alternative hypothesis, H1: At least one of the k classifications is different from the others

Statistic

The statistic calculated is:

\[ X = \sum_{i=1}^{n} \left( \sum_{j=1}^{k} r_{ij} - \frac{k(n+1)}{2} \right)^2 \]

where rij is the rank attributed to the element Ei by the procedure j (expert opinion, criteria, method …).

The statistic X is compared to critical values Xα available in a table.

Interpreting the test The decision rule is the following:

H0 will be rejected if X > Xα.
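
One common way to compare k rankings of the same n elements is Kendall's coefficient of concordance. The sketch below assumes that this is the statistic intended: X is the sum, over the elements, of the squared deviation of each element's rank sum from k(n + 1)/2, and W = 12X / (k²(n³ – n)) rescales it to lie between 0 and 1. It further assumes complete rankings without ties; names and data are hypothetical.

```python
def concordance(rankings):
    """rankings[j][i] is the rank given to element i by procedure j (ranks 1..n)."""
    k = len(rankings)
    n = len(rankings[0])
    # Sum of the ranks received by each element across the k procedures.
    rank_sums = [sum(ranking[i] for ranking in rankings) for i in range(n)]
    expected_sum = k * (n + 1) / 2
    x = sum((total - expected_sum) ** 2 for total in rank_sums)
    w = 12 * x / (k ** 2 * (n ** 3 - n))
    return x, w


# Three hypothetical experts ranking four elements identically.
identical = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
x_identical, w_identical = concordance(identical)

# The same experts in partial disagreement.
mixed = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
x_mixed, w_mixed = concordance(mixed)
```

W equals 1 when all procedures give exactly the same ranking and moves towards 0 as the rankings become unrelated.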


Specifying the Phenomenon or System to be Modeled in the Research

Models can be defined as abstract representations of real phenomena. They represent the components of the phenomena studied as much as they do the interrelationships among these components. Identifying the phenomenon or system to be modeled is a three-step process. The first step is to determine its components. The interrelationships between these components must then be specified. Finally, because these models are representations, the researcher will want to formalize them through a graphic or mathematical description of the components and their presumed interrelations. Most often, a model will be rep­resented by circles, rectangles or squares linked by arrows, curves, lines, etc.

1. Components of a Model

A model, in its most simple form (a relationship of cause and effect between two variables), essentially comprises two types of variables with different func­tions: independent variables (also called explanatory or exogenous variables) and dependent variables (also called variables to be explained, or endogenous variables). In this causal relationship, the independent variable represents the cause and its effect is measured on the dependent variable.

Quite often, the phenomenon being studied includes more elements than simply an independent variable and a dependent variable. A single dependent variable is liable to have multiple causes. Several independent variables may explain one dependent variable. These variables are considered as causes in the same way as the initial causal variable, and produce an additive effect. They are an addition to the model and explain the dependent variable.

The introduction of new variables within a simple causal relationship between two independent and dependent variables may also produce an inter­active effect. In this case, the dependent variable is influenced by two causal variables whose effect can only be seen if these two associated variables intervene at the same time.

Figure 12.1 schematizes the additive and interactive effects linked to the introduction of a third variable, Z, into a simple causal relationship between two variables, X and Y.
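
One way to write these two effects down, assuming a linear model, is given below. The coefficients b0 to b3 and the error term e are notation introduced here for illustration, not the authors'.

```latex
\begin{align*}
\text{additive effect:}    \quad & Y = b_0 + b_1 X + b_2 Z + e \\
\text{interactive effect:} \quad & Y = b_0 + b_1 X + b_2 Z + b_3 XZ + e
\end{align*}
```

In the additive case, the effect of X on Y is b1 whatever the value of Z. In the interactive case it is b1 + b3 Z, so it depends on Z; in the purest form described above (b1 = b2 = 0), X and Z affect Y only when both are present.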

Finally, variables can intervene in the direct causal relationship between the dependent variable or variables and the independent variable. These variables, described as intervenors, take two forms: mediator or moderator.

When the effect of the independent variable X on the dependent variable Y is measured by the intermediary of a third variable Z, we call this third variable a mediator. The association or causality observed between X and Y results from the fact that X influences Z, which in turn influences Y (Baron and Kenny, 1986).

A moderator variable modifies the intensity (increasing or decreasing) and/or the sign of the relationship between independent and dependent vari­ables (Sharma et al., 1981). The moderator variable enables researchers to iden­tify when certain effects are produced and then to break down a population into subpopulations, according to whether the measured effect is present or not. Figure 12.2 schematizes the mediator and moderator effects relative to the introduction of a third variable in the relationship between independent and dependent variables.
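
The idea of breaking a population into subpopulations according to a moderator can be sketched as follows: the correlation between X and Y is computed separately for each level of a categorical moderator Z. This is an illustrative sketch with hypothetical names and data, not a procedure taken from the source.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_by_subgroup(x, y, z):
    """Correlation between x and y within each level of the moderator z."""
    groups = {}
    for xi, yi, zi in zip(x, y, z):
        xs, ys = groups.setdefault(zi, ([], []))
        xs.append(xi)
        ys.append(yi)
    return {level: pearson(xs, ys) for level, (xs, ys) in groups.items()}


# Hypothetical data: the X-Y relationship changes sign with firm size (Z).
x = [1, 2, 3, 4, 1, 2, 3, 4]
y = [2, 4, 6, 8, 8, 6, 4, 2]
z = ["small"] * 4 + ["large"] * 4
by_group = correlation_by_subgroup(x, y, z)
overall = pearson(x, y)
```

In these made-up data the overall correlation is zero, while it is +1 among small firms and -1 among large ones, which is the kind of pattern a moderator is meant to reveal.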

To move from a phenomenon to a model, it is not enough just to identify the different variables involved. One must also determine the status of these vari­ables: dependent, independent (with an additive or interactive effect), modera­tor or mediator.

2. Relationships

Three types of relationship are possible between two of a phenomenon’s vari­ables (Davis, 1985). The first two are causal, while the third involves simple association:

  • The first possibility is a simple causal relationship between the two vari­ables: X => Y (X influences Y, but Y does not influence X).

Example: A simple causal relationship

Introducing the theory of the ecology of populations to explain organizational iner­tia and organizational change, Hannan and Freeman (1984) postulate that selection mechanisms favor companies whose structures are inert. They present organiza­tional inertia as the result of natural selection mechanisms. The variable ‘natural selection’ therefore acts on the variable ‘organizational inertia’.

  • The second relationship involves a reciprocal influence between two vari­ables: X => Y => X (X influences Y, which in turn influences X).

Example: A reciprocal causal relationship

Many authors refer to the level of an organization’s performance as a factor that provokes change. They consider that weak performance tends to push organizations into engaging in strategic change. However, others point out that it is important to take the reciprocal effect into account. According to Romanelli and Tushman (1986), organizations improve their performance because they know how to make timely changes in their action plan. There is, therefore, a reciprocal influence between performance and change, which can be translated as follows: weak performance pushes companies to change and such change enables performance to be improved.

  • The third relationship demonstrates that an association exists between the two variables. However, it is not possible to determine which causes the other: X <=> Y (X relates to Y and Y to X).

Once the nature of the relationship has been determined, it is important to establish its sign. This is either:

  • positive, with X and Y varying in the same direction
  • or negative, with X and Y varying in opposite directions.

In a causal relationship, the sign translates as follows. It is positive when an increase (or reduction) in X leads to an increase (or decrease) in Y and it is nega­tive when an increase (or decrease) in X leads to a decrease (or increase) in Y. In the case of an association between two variables, X and Y, the sign of the rela­tionship is positive when X and Y are high or low at the same time and nega­tive when X is high and Y is low and vice versa.

In a relationship between two variables, causality is not always immediate. Indeed, there can be a latent effect – a period during which the effect of a variable is awaited. This effect is particularly significant when researchers use correlation to test relationships between two time series. They need to determine what period of time elapses between the cause and effect, and to shift one series relative to the other accordingly to conduct this test.

It is also possible that the relationship or causality will only be effective above a certain level. As long as the value of X remains below this level, X has no influence on Y (or there is no relationship between X and Y). When the value of X goes beyond this level, then X has an influence on Y (or there is a relationship between the two variables X and Y). An effect related to level may also influence the relationship’s sign. For example, below a certain level, the relationship is positive. Once this level has been attained, the sign is inverted and becomes negative. Figure 12.3 schematizes these different effects.
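
The latent (lag) effect can be explored by correlating X at time t with Y at time t + lag for several candidate lags and keeping the lag with the strongest correlation. The sketch below is a simple illustration of that idea; the function names and the two series are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] for a non-negative lag."""
    if lag > 0:
        return pearson(x[:-lag], y[lag:])
    return pearson(x, y)


def best_lag(x, y, max_lag):
    """Lag between 0 and max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1), key=lambda lag: abs(lagged_correlation(x, y, lag)))


# Hypothetical series: y reproduces x (doubled) two periods later.
x = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
y = [0, 0, 2, 6, 4, 10, 8, 12, 16, 14]
lag = best_lag(x, y, max_lag=3)
```

Each extra period of lag shortens the overlapping part of the two series by one observation, so long lags should be read with caution on short series.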

Whatever the nature or sign of the relationship, causality can only be speci­fied in two ways: deductively, starting from theory, or inductively, by observa­tion. Researchers must begin with the hypothesis of a causal relationship between two variables, either based on theory or because observation is tend­ing to reveal it. Only then should they evaluate and test the relationship, either quantitatively or qualitatively.

3. Formal Representation

Models are usually represented in the form of diagrams. This formalization responds, above all, to the need to communicate information. A drawing can be much clearer and easier to understand than a lengthy verbal or written descrip­tion, not least when the model is complex (that is, contains several variables and interrelations).

Over the years, a diagrammatic convention has developed from ‘path analysis’ or ‘path modeling’ (see Section 3). According to this convention, concepts or variables that cannot be directly observed (also called latent variables, concepts or constructs) are represented by circles or ellipses. Variables that can be directly observed (manifest and observed variables, and measurement variables or indicators) are represented by squares or rectangles. Causal relationships are indicated by tipped arrows, with the arrowhead indicating the direction of causality. Reciprocal causality between two variables or concepts is indicated by two arrows going in opposite directions. Simple associations (correlations or covariances) between variables or concepts are indicated by curves without arrowheads, or with two arrowheads pointing in opposite directions at the two ends of the same curve. A curve that turns back on one variable or concept indicates variance (covariance of an element with itself). Arrows without origin indicate errors or residuals.

Figure 12.4 is an example of a formal representation of a model examining the relationship between strategy and performance. In this example, ‘product lines’ or ‘profit margin’ are directly observable variables, and ‘scale’ or ‘profitability’ are concepts (that is, variables that are not directly observable). The hypothesis is that three causal relationships exist between the concepts of ‘segments’, ‘resources’ and ‘scale’, on the one hand, and that of ‘profitability’ on the other. There is also assumed to be an association relationship between the three concepts ‘segments’, ‘resources’ and ‘scale’. Finally, all the directly observable variables contain terms of error as well as the concept of ‘profitability’.

The notion of a model can be understood very broadly in quantitative research. Many statistical methods aim to measure causal relationships between variables by using a model to indicate the relationship system. This is often expressed by equations predicting the dependent (explained) variables from other variables (known as independent or explanatory variables). Examples include linear regression and analysis of variance.

These explanatory methods are particular examples of more general tech­niques through which causal relationship networks can be examined (Hoyle, 1995). These techniques are known by various names; such as ‘path analysis’ or ‘path modeling’, which we have already mentioned, ‘causal model analysis’ or ‘causal modeling’, ‘structural equations analysis’ or ‘structural equation model­ing’ and ‘latent variable analysis of structural equations’. Some of them even just carry the name of a computer program, such as LISREL, PLS or AMOS. In this chapter, we have decided to use the term ‘causal model’ to indicate these techniques for examining causal relationship networks.

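Causal models do not necessarily have to be represented graphically. They can also be represented mathematically. As a rule, causal models are expressed in the form of equations. These are often matrix equations, in which case the general notation takes the following form:

\[ Y = \Lambda_Y \eta + \varepsilon \]
\[ X = \Lambda_X \xi + \delta \]
\[ \eta = \Gamma \xi + \beta \eta + \zeta \]

where:

Y = the vector of observed endogenous variables

X = the vector of observed exogenous variables

η = the vector of latent endogenous variables

ξ = the vector of latent exogenous variables

Λ_Y, Λ_X = the matrices of coefficients linking the observed variables to the latent variables

ε, δ = the vectors of measurement errors

ζ = the vector of errors of the structural equations

Γ = the matrix of causal relationships between the latent exogenous and the endogenous variables

β = the matrix of causal relationships between the latent endogenous variables.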
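
As a short worked step, assume the structural part of such a model is written in the usual LISREL-style form, with η the latent endogenous variables, ξ the latent exogenous variables and ζ an error vector (this notation is an assumption where the text does not spell it out). If the matrix I − β is invertible, the latent endogenous variables can be expressed from the exogenous ones alone:

```latex
\begin{align*}
\eta &= \Gamma \xi + \beta \eta + \zeta \\
(I - \beta)\,\eta &= \Gamma \xi + \zeta \\
\eta &= (I - \beta)^{-1} \left( \Gamma \xi + \zeta \right)
\end{align*}
```

The matrix (I − β)^{-1} Γ then collects the total effects of the exogenous latent variables, that is, their direct effects plus those transmitted through other endogenous variables.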

Causal models are interesting not only because they produced the conven­tion for formally representing phenomena or modeling systems. They are also the most successful technique for modeling causal relationships quantitatively, illustrating the quantitative process that takes place at each stage.


Specifying Variables and Concepts of the Research

Specifying a model’s different concepts and variables is, above all, dependent on the researcher’s chosen approach, which can be inductive or deductive, qualita­tive or quantitative. There are two levels of specification. One is conceptual, and enables researchers to determine the nature of concepts. The other is opera­tional, and enables researchers to move from concepts to the variables that result from field observation. An inductive approach specifies conceptual and opera­tional levels simultaneously, whereas a deductive approach moves from the con­ceptual level to the operational level. But even with a deductive approach, the researcher may need to return to both of these two levels at a later stage.

1. Qualitative Method

A particular characteristic of qualitative methods is that they do not necessitate numerical evaluation of the model’s variables. Specification therefore involves describing the model’s concepts without quantifying them. We can, however, evaluate concepts according to a number of dimensions, representing the vari­ous ‘forms’ they can take. There is no reason why researchers should not use quantitative data to specify certain of their model’s concepts.

1.1. Qualitative inductive method

When researchers decide to employ a qualitative/inductive method, field data is used to draw out the concepts that represent the phenomenon being studied. Glaser and Strauss (1967) propose an inductive method of coding, which they call ‘open coding’. This method enables ‘the process of breaking down, examin­ing, comparing, conceptualizing, and categorizing data’ (Strauss and Corbin, 1990: 61) to occur. It has four interactive phases:

Phase 1: Labeling phenomena This phase involves taking an observation – spoken or written – and giving a name to each incident, idea, or event it contains. To facilitate this first ‘conceptualization’ (passage from data to concept) researc­hers can ask themselves the following questions: what is it? and what does that represent?

Phase 2: Discovering categories This phase involves grouping together the concepts resulting from the first phase, to reduce their number. For this catego­rization, researchers can group together those concepts that are most closely related to each other. Alternatively, they can group together their observations, while keeping the concepts in mind.

Phase 3: Naming a category Researchers can invent this name themselves or borrow it from the literature. It can also be based on words or phrases used by interviewees.

Phase 4: Developing categories in terms of their properties and dimensions Now the researcher defines the ‘properties’ and ‘dimensions’ of each of the categories created during the previous phases. Properties are a category’s characteristics or attributes, while dimensions represent the localization of each property on a continuum. Dimensions translate the different forms the property can take (a phenomenon’s intensity, for example). They enable researchers to construct dif­ferent profiles of a phenomenon, and to represent its specific properties under different conditions.

Miles and Huberman (1984a) propose a process which, while remaining inductive, is based on a conceptual framework and gives researchers a focus for collecting data in the field. They suggest a number of tactics for drawing out these concepts, which we will explain briefly. The first tactic is to count, or isolate, items that recur during interviews or observations. This tactic aims at isolating the model’s concepts (which the authors also call themes). The second tactic involves grouping the elements in one or several dimensions together to create categories. This can be done by association (grouping together similar elements) or dissociation (separating dissimilar elements). The third tactic is to subdivide the categories created earlier, to explore whether a designated category might actually correspond to two or more categories. Researchers need to be cautious about wanting to subdivide each category, and so guard against excessive atomization. The fourth tactic is to relate the particular to the general – to ask: of what is this element an example? and does it belong to a larger class?

The fifth and final tactic involves factorizing. The term ‘factor’ stems from factor analysis, a statistical tool used to reduce a large number of observed vari­ables to a small number of concepts that are not directly observed. Factorization occurs in several stages. First, researchers take an inventory of items arising during their interviews or observations. They then group the items according to a logical rule they have defined in advance. The rule could be that items arising concomitantly during the interviews should be grouped together. Alternatively, one could group together items translating a particular event. At the end of this phase, researchers have several lists of items at their disposal. They then describe the various items so as to produce a smaller list of code names. They group these code names together under a common factor, which they then describe.
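
Two of these tactics lend themselves to a simple tally once interview segments have been labeled: counting how many interviews mention each item, and counting how often two items arise together, which is one possible grouping rule for factorizing. The sketch below is only an illustration; the item labels and function names are hypothetical, and the interpretive work of naming and describing factors remains with the researcher.

```python
from collections import Counter
from itertools import combinations


def count_items(coded_interviews):
    """Number of interviews in which each item appears at least once."""
    return Counter(item for interview in coded_interviews for item in set(interview))


def count_cooccurrences(coded_interviews):
    """Number of interviews in which each pair of items appears together."""
    pairs = Counter()
    for interview in coded_interviews:
        pairs.update(combinations(sorted(set(interview)), 2))
    return pairs


# Hypothetical coded interviews: each list holds the items noted in one interview.
interviews = [
    ["delay", "cost", "trust"],
    ["cost", "trust"],
    ["delay", "cost"],
    ["trust", "cost", "quality"],
]
item_counts = count_items(interviews)
pair_counts = count_cooccurrences(interviews)

# Items recurring in at least two interviews are candidates for concepts.
recurring = {item for item, count in item_counts.items() if count >= 2}
```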

The two methods explained above enable researchers to draw out the model’s variables, and then its concepts, from observations in the field: to spe­cify the model’s components. These methods are inductive, but can still be used in a theoretical framework. In both cases, researchers are advised to loop back continually between fieldwork data and relevant literature during the coding process. In this way, they should be able to specify and formalize the variables (or concepts) they have defined.

1.2. The qualitative deductive method

With this method researchers draw up a list of the concepts that make up the phenomenon being studied, using information gleaned from the results of earlier research. They then operationalize these concepts using data from the empirical study so as to obtain variables. However, researchers adopting this method are advised to enrich and remodel the concepts obtained from the lit­erature using data gathered in the field (Miles and Huberman, 1984a). To do this, they can turn to techniques for specifying variables that are appropriate for use in inductive research.

Example: Specifying a model’s variables using a qualitative deductive method

The principle of this method is to start with a list of codes or concepts stemming from a conceptual framework, research questions or initial hypotheses (or research propositions). These concepts are then operationalized into directly observed vari­ables. In their study on teaching reforms, Miles and Huberman (1984a) initially conceptualized the process of innovation ‘as a reciprocal transformation of the innovation itself, of those making use of it, and the host classroom or school’ (Miles and Huberman, 1984a: 98). They drew up a list of seven general codes (or concepts): property of innovation (PI), external context (EC), internal context (IC), adoption process (AP), site dynamic and transformation (SDT), and new configurations and final results (NCR). This list breaks down into subcodes (or variables), which can be directly observed in the field (operationalization of concepts). For example the code internal context breaks down into: characteristics of the internal context (CI-CAR), norms and authority (CI-NORM) and history of the innovation (CI-HIST), etc.

2. The Quantitative Method

Quantitative causal modeling techniques place the identification of variables and concepts center stage. They have systematized the theoretical distinction between variables and concepts.

Usually, causal models contain variables that are not directly observable (known as latent variables, concepts or constructs) and directly observable variables (known as manifest or observed variables, indicators, or variables of measurement). The notion of a latent variable is central in human and social sciences. Concepts such as intelligence, attitude or personality are latent variables. Manifest variables are approximate measurements of latent variables. A score in an IQ test can be considered as a manifest variable that is an approximation of the latent variable ‘intelligence’. In causal modeling, it is recommended that each latent variable should be measured by several manifest variables. The latent variable is defined by what the various manifest variables that are supposed to measure it have in common (Hoyle, 1995). From this point of view, latent variables correspond to the common factors we recognize in factor analysis. They can, as a result, be considered as devoid of measurement errors.
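
The link between one latent variable and its manifest variables is commonly written as a set of linear measurement equations. The notation below is an assumption introduced for illustration: ξ is the latent variable, x_1 to x_q are its q manifest variables, λ_i are loadings and δ_i are measurement errors.

```latex
x_i = \lambda_i \, \xi + \delta_i, \qquad i = 1, \dots, q
```

Each manifest variable mixes the common part λ_i ξ with its own error δ_i, which is why the latent variable itself, defined by the common part, can be treated as free of measurement error.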

2.1. The quantitative deductive method

In specifying concepts, there are several possibilities. The model’s concepts may already be precisely defined. In strategy, for example, the concept of a strategic group univocally indicates a group of firms in a given sector that have the same strategy. Researchers using such a concept in their model will not be spending time redefining it. Johansson and Yip (1994) provide other examples of identifying the concepts of industry structure, global strategy, organization structure, etc. Already defined methods of operationalizing concepts may even be available. This is true in the case of the aforementioned strategic groups. Strategic groups can be operationalized through a cluster analysis of companies, characterized by variables that measure strategic positioning choices and resource allocation. If a method of operationalizing concepts is already avail­able, the researcher’s main preoccupation will be to verify its validity.

That said, even when the concepts are defined and the operationalization method has been determined, researchers are still advised systematically to try and enrich and remodel the variables/concepts stemming from earlier works by means of observation or theory.

Researchers need to clearly define their concepts and clearly formulate their method of operationalization. They can opt for either an inductive, qualitative or quantitative method to specify the concepts.

2.2. Quantitative inductive method

While quantitative methods are more readily associated with deductive research, they can very well be called into use for inductive research. It is altogether possible, when specifying a model’s variables and concepts, to use statistical methods to draw out these concepts from the available data. This practice is very common in what is known as ‘French’ data analysis, which was popular­ized by Jean-Paul Benzecri’s team (see Benzecri, 1980; Lebart et al., 1984). In general, it involves using a table of empirical data to extract structures, classes and regularities.

Data analysis methods such as correspondence analysis, factor analysis or cluster analysis (classification analysis) are fruitful and very simple methods for drawing out concepts from empirical data. For example, several strategic management researchers have been able to use factor analysis to identify ‘generic strategies’ that can be classed as factors stemming from data. The work of Dess and Davis (1984) illustrates this process. Equally, many other researchers have turned to cluster analysis to identify ‘strategic groups’ – classes stemming from classification analyses. Thomas and Venkatraman (1988) present many such research works.
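
To illustrate how a cluster analysis can draw classes out of data, the sketch below groups hypothetical firms described by two strategic variables using a bare-bones k-means procedure. K-means is only one of many clustering methods and is not claimed to be the one used in the studies cited; the variables, data and naive choice of starting centroids are all assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Return (assignment, centroids) after a fixed number of k-means passes."""
    # Assumption: the first k points serve as starting centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, p in enumerate(points):
            assignment[idx] = min(
                range(k), key=lambda c: squared_distance(p, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Hypothetical firms: (advertising intensity, R&D intensity), both scaled 0 to 1.
firms = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9), (0.15, 0.15), (0.85, 0.85)]
groups, centers = kmeans(firms, k=2)
```

Each resulting class is a candidate ‘strategic group’; the researcher still has to interpret and name it, and to check that the grouping is stable under other starting points or methods.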


Identifying Relationships Among a Model’s Variables and Concepts in the Research

As with identifying the model’s concepts and variables, identifying relationships depends above all on the method the researcher adopts: qualitative or quantita­tive, inductive or deductive. Its aim is to determine whether there is a relation­ship between the model’s concepts (and variables), the nature of this relationship (causal or simple association) and the relationship’s sign (positive or negative).

1. Qualitative Method

Specifying qualitative relationships involves determining the elements that characterize the relationship. It is not a case of evaluating the relationship mathematically or statistically. However, nothing prevents the researcher from using a coding procedure to quantify the data before evaluating the relation­ship quantitatively.

1.1. Qualitative inductive method

In the case of inductive methods, Glaser and Strauss (1967) propose an ‘axial coding’ technique comprising a set of procedures by which the data is grouped together to create links between categories (Strauss and Corbin, 1990). The aim of axial coding is to specify a category (which the authors also call a phenome­non) according to the following categories:

  • Causal conditions. These conditions, which the authors call ‘causal condi­tions’ or antecedent conditions, are identified with the help of the following questions: Why? When? How? Until when? There may be several causal conditions for one phenomenon.
  • The context of all the phenomenon’s properties: its geographical and tem­poral position, etc. The researcher identifies the context by posing the fol­lowing questions: When? For how long? With what intensity? According to which localization?, etc.
  • Action/interaction strategies engaged to drive the phenomenon.
  • Intervening conditions, represented by the structural context, which facili­tate or restrict actions and interactions. These include time, space, culture, economic status, technical status, careers, history, etc.
  • Consequences linked to these strategies. These take the form of events, and active responses to initial strategies. They are current or potential and can become causal conditions of other phenomena.

The phenomenon (or category) corresponds to the model’s central idea. It is revealed by the following questions: To what does the data refer? What is the aim of the actions and interactions?

1.2. Qualitative deductive method

As when specifying variables or concepts, researchers can use a qualitative/ deductive method to establish relationships between variables using the results of earlier research (available literature). Relationships established in this way can also be supplemented by other relationships stemming from initial observations in the field. So, before testing a model that was constructed a priori, researchers are advised to conduct several interviews, or collect infor­mation that will enable them to demonstrate other relationships than those stemming from the literature. The next stage involves operationalizing these relationships.

Example: Identifying relationships using a qualitative deductive method

In researching the relationship between sensemaking at a strategic level and organi­zational performance, Thomas et al. (1993) studied public hospitals in one US state and sought to relate the search for information to how this information is inter­preted, the action then taken, and the results. Through an analysis of the literature, Thomas et al. were able to describe the relationships between their model’s different elements (see Figure 12.5).

Information scanning was operationalized according to two variables: ‘information use’ and ‘information source’ (which could be internal or external). The concept of interpretation incorporated the two variables ‘positive gain’ and ‘controllability’. Strategic change was measured at the level of ‘product – service change’ over the period 1987 to 1989. Performance was measured using three variables: ‘occupancy’, ‘profit per discharge’ and ‘admissions’.

2. Quantitative Methods

Causal models provide a good example of the quantitative method of specify­ing causal relationships within a model. While quantitative methods are gen­erally associated with a deductive approach, we will see that they can also be used in inductive research.

2.1. Quantitative deductive method

We can distinguish between two situations researchers may encounter in speci­fying the relationships between a model’s variables/concepts. When consult­ing the available literature, they may find specific hypotheses that clearly detail the nature and sign of the relationships between the variables/concepts. In this case, their main preoccupation will be to verify the validity of these hypothe­ses. The problem then essentially becomes one of testing the hypotheses or causal model. This question of testing causal models is dealt with in the fourth and final section of this chapter.

However, very often, researchers do not have a set of hypotheses or propositions prepared in advance about the relationships between the model’s concepts and variables. They then have to proceed to a full causal analysis. Although any of the qualitative techniques presented in the first part of this section might be used, quantitative methods (that is, causal models) are more profitable.

Causal models can be defined as the union of two conceptually different models:

  • A measurement model relating the latent variables to their measurement indicators (that is, manifest or observed variables).
  • A model of structural equations translating a group of cause-and-effect rela­tionships between latent variables or observed variables that do not repre­sent latent variables.

Relationships between latent variables and their measurement indicators are called epistemic. There are three types: non-directional, reflective and formative. Non-directional relationships are simple associations. They do not represent a causal relationship but a covariance (or a correlation when variables are standardized). In reflective relationships, measurement indicators (manifest variables) reflect the underlying latent variable (that is, the latent variable is the cause of the manifest variables). In formative relationships, measurement indicators ‘form’ the latent variable (that is, they are the cause). The latent variable is entirely determined by the linear combination of its indicators. It can be difficult to determine whether a relationship is reflective or formative. For example, intelligence is a latent variable linked by reflective relationships to its measurement indicators, such as IQ. (Intelligence is the cause of the observed IQ.) However, the relationships between the latent variable ‘socio-economic status’ and measurement indicators such as income or level of education are, by nature, formative (income and level of education affect socio-economic status).

In the example in Figure 12.4, the ‘measurement model’ relates to measure­ment of the four latent variables ‘scope’, ‘resources’, ‘scale’ and ‘profitability’. The epistemic relationships are all reflective. Measurement models are analo­gous to factor analysis (on the manifest variables). Structural models look at causal relationships between latent variables, and are analogous to a series of linear regressions between latent variables (here, the relationships are between ‘scope’, ‘resources’ and ‘scale’, on the one hand, and ‘profitability’, on the other). Causal models can be presented as a combination of factor analysis (on the manifest variables) and linear regressions on the factors (latent variables). We can see how causal models are a generalization of factor and regression analyses.

Researchers who choose a quantitative process to specify relationships must systematically distinguish between the different kinds of relationship between their model’s variables (association, simple causality and reciprocal causality). In the language of causal models, association relationships are also called non-directional relationships, and represent covariance (or correlations when the variables are standardized). Simple causal relationships are known as unidirectional, while reciprocal causal relationships are known as bi-directional. On a very general level, all relationships can be broken down into two effects: causal and non-causal (association). Causal effects comprise two different effects: direct and indirect. A direct effect represents a direct causal relationship between an independent variable and a dependent variable. However, in causal models, one particular variable can, at the same time, be the dependent variable in one direct effect and the independent variable in another. This possibility for a variable to be both independent and dependent in one model goes to the heart of the notion of indirect effect.

Indirect effect is the effect of an independent variable on a dependent variable via one or several mediator variables. The sum of the direct and indirect effects constitutes the total effect. Non-causal effects (association) also break down into two. First, there are association effects due to a common identified cause (that is, one or several variables within the model constitute the common cause of the two associated variables). Then, there are the non-analyzed association effects (that is, for various reasons, the researcher considers that the variables are associated). Researchers may do this when, for two related variables, they are unable to differentiate between cause and effect, or when they know that the two variables have one or several causes in common outside of the model. In causal models, non-analyzed associations translate as covariance (or correlations) and are represented by curves that may have arrow heads at each end.
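The decomposition into direct, indirect and total effects can be illustrated with a minimal sketch. It assumes a linear model with a single mediator M between X and Y, in which case the indirect effect is usually taken as the product of the two path coefficients along the mediated route. The function name and the coefficient values are hypothetical and serve only as an illustration.

```python
def decompose_effects(direct, a, b):
    """Split the effect of X on Y into direct, indirect and total parts.

    direct: path coefficient X -> Y
    a:      path coefficient X -> M (the mediator)
    b:      path coefficient M -> Y
    Assumes a linear model with one mediator (an illustrative assumption).
    """
    indirect = a * b            # effect transmitted through the mediator
    total = direct + indirect   # total effect = direct + indirect
    return {"direct": direct, "indirect": indirect, "total": total}


# Hypothetical coefficients, chosen only for the example.
effects = decompose_effects(direct=0.30, a=0.50, b=0.40)
print(effects)
```

With several mediators, the indirect effect would be the sum of such products over every mediated route.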

A model is said to be recursive if it has no bi-directional causal effect (that is, no causal relationship that is directly or indirectly reciprocal). While the terms can seem misleading, it should be noted that recursive models are unidirectional, and non-recursive models bi-directional. Recursive models occupy an important position in the history of causal models. One of the most well-known members of this family of methods, path analysis, works only with recursive models. The other major characteristic of path analysis is that it only considers manifest variables. Path analysis is thus a special case of causal models (Maruyama, 1998). Figure 12.6 presents an example of a path analysis model.
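Checking whether a model is recursive amounts to checking that its directed causal paths contain no loop. The sketch below assumes the model is given as a list of (cause, effect) pairs and uses a depth-first search to detect cycles. The function name and the example paths are hypothetical, and non-analyzed associations (covariances) are deliberately left out.

```python
def is_recursive(paths):
    """Return True if the directed causal paths contain no cycle.

    paths: iterable of (cause, effect) pairs.
    """
    graph = {}
    for cause, effect in paths:
        graph.setdefault(cause, []).append(effect)
        graph.setdefault(effect, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, in progress, finished
    state = {node: WHITE for node in graph}

    def has_cycle(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:        # back edge: a reciprocal loop
                return True
            if state[nxt] == WHITE and has_cycle(nxt):
                return True
        state[node] = BLACK
        return False

    return not any(state[n] == WHITE and has_cycle(n) for n in graph)


# Hypothetical models built on the variable names used in the text.
one_way = [("scope", "profitability"),
           ("resources", "profitability"),
           ("scale", "profitability")]
with_feedback = one_way + [("profitability", "resources")]

print(is_recursive(one_way))        # no reciprocal path
print(is_recursive(with_feedback))  # resources <-> profitability loop
```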

Identifying causal relationships in the framework of a quantitative approach can be more precise than just specifying the nature of these relationships (asso­ciation, unidirectional or bi-directional). It is also possible to fix the sign of the relationships and even their intensity. Equality or inequality constraints may be taken into account. For example, the researcher may decide that one particular relationship is equal to a given fixed value (0.50, for instance), that another should be negative, that a third will be equal to a fourth, which will be equal to double a fifth, which will be less than a sixth, etc. Intentionally extreme, this example illustrates the great flexibility researchers have when they quantitatively specify the relationships between variables and concepts in a causal model.

2.2. Quantitative inductive methods

Quantitative methods can be used inductively to expose causal relationships between variables or concepts. By analyzing a simple matrix of correlations between variables, researchers can draw out possible causal relationships (between pairs of variables that are strongly correlated). It is also possible to make exploratory use of ‘explanatory’ statistical methods (for example, linear regression or analysis of variance) to identify statistically significant ‘causal’ relationships between different variables. However, explanatory methods are a special case of causal methods, and we prefer here to discuss the subject more generally.
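As an illustration of this exploratory use of a correlation matrix, the sketch below lists the pairs of variables whose correlation exceeds a threshold in absolute value. The threshold of 0.5, the variable names and the correlation values are all hypothetical. A strong correlation only flags a candidate relationship; it says nothing about its direction or its causal nature.

```python
def strong_pairs(names, corr, threshold=0.5):
    """Return (var1, var2, r) for pairs with |r| >= threshold, strongest first.

    names: list of variable names
    corr:  symmetric correlation matrix as a list of lists
    """
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):   # upper triangle only
            if abs(corr[i][j]) >= threshold:
                pairs.append((names[i], names[j], corr[i][j]))
    return sorted(pairs, key=lambda p: -abs(p[2]))


# Hypothetical correlation matrix for three variables.
names = ["scope", "scale", "profitability"]
corr = [
    [1.00, 0.20, 0.65],
    [0.20, 1.00, -0.55],
    [0.65, -0.55, 1.00],
]

candidates = strong_pairs(names, corr)
for var1, var2, r in candidates:
    print(f"{var1} / {var2}: r = {r:+.2f}")
```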

Joreskog (1993) distinguishes between three causal modeling situations: confirmation, comparison of alternative models, and model generation. In the strictly confirmatory situation, researchers build a model which they then test on empirical data. Whether the test results lead to the model being rejected or retained, no other action is taken. It is very rare for researchers to follow such a procedure. The two other situations are much more common. In the alternative models situation, researchers start off with several alternative models. They evaluate each one using the same set of data and then compare them so as to retain the best ones. This is common when competing theories exist and when the area of interest has not yet reached a mature phase or there is uncertainty about the relationships between variables and concepts. In model generation, researchers begin with an already determined model, test it on a set of relevant data and then refine it (in particular, by eliminating the non-significant relationships and adding significant relationships omitted earlier). Aaker and Bagozzi (1979) make the following observation: while, ideally, a single model of structural equations corresponds to a given theory and researchers then calculate that model’s parameters, the situation is often different in practice. Most often, researchers begin with an initial version of the model of structural equations which they then test and improve iteratively before gradually obtaining a satisfactory version. According to Aaker and Bagozzi, this phenomenon arises because of the immaturity of the theories involved, the complexity of the management problems and the presence, at every stage of the research process, of uncertainties which translate as measurement errors.

While the strictly confirmatory situation is more in keeping with a deduc­tive process, model generation and alternative models are totally compatible with an inductive approach. But as the existence of a (statistically significant) relationship does not mean there is necessarily a causal effect, researchers must always supplement their exploratory quantitative analyses with theoretical causal analysis.

Evaluating and Testing the Research Model

Evaluating and testing a model does not simply mean testing the hypotheses or relationships between the model’s concepts or variables one after the other, but also judging its internal global coherence.

1. Qualitative Methods

In some studies, researchers want to test the existence of a causal relationship between two variables without having recourse to sophisticated quantitative methods (such as those presented in the second part of this section). In fact, situations exist in which the qualitative data (resulting from interviews or docu­ments) cannot be transformed into quantitative data or is insufficient for the use of statistical tools. In this case, researchers identify within their data the arguments that either invalidate or corroborate their initial hypothesis about the existence of a relationship between two variables. They then establish a decision-making rule for determining when they should reject or confirm their initial hypothesis. To establish such a rule, one needs to refer back to the exact nature of the researcher’s hypotheses (or propositions). Zaltman et al. (1973) identify three types of hypothe­ses that can be tested empirically: those that are purely confirmable, those that are purely refutable and those that are both refutable and confirmable.

  • Hypotheses are considered to be purely confirmable when contrary argu­ments, discovered empirically, do not allow the researcher to refute them. For example, the hypothesis that ‘there are opinion leaders in companies that change’, is purely confirmable. In fact, if researchers discover contrary argu­ments, such as companies changing without opinion leaders, this does not mean they should refute the possibility of finding such leaders in certain cases.
  • Hypotheses are considered to be purely refutable when they cannot be confirmed. A single negative instance suffices to refute them. (An asymmetry therefore exists between verifiability and falsifiability.) For example, the hypothesis ‘all companies which change call upon opinion leaders’ is purely refutable. In fact, finding just one case of a company changing without an opinion leader is sufficient to cast doubt on the initial hypothesis. Universal propositions form part of this group of hypotheses. Indeed, there are an infinite number of possibilities for testing such propositions empirically. It is, therefore, difficult to demonstrate that all possible situations have been thought of and empirically tested.

The testing of such hypotheses is not undertaken directly. One needs to derive sub-hypotheses from the initial hypothesis and then confirm or refute them. These sub-hypotheses specify the conditions under which the initial proposition will be tested. For example, from the hypothesis ‘all companies which change call upon opinion leaders’ one derives the hypothesis that ‘type A companies call upon opinion leaders’ and the auxiliary hypothesis that ‘type A companies have changed’. It is therefore possible to identify clearly all the companies belonging to group A and test the derived hypothesis among them. If one company in the group has not called on an opinion leader in order to implement change then the derived hypothesis is refuted and, as a result, so is the initial hypothesis. If all type A companies have called upon opinion leaders when instituting change, the derived hypothesis is then confirmed. But the researcher cannot conclude that the initial hypothesis is also confirmed.

  • The final group consists of hypotheses that are both confirmable and refutable. As an example, let us consider the hypothesis ‘small companies change more often than large ones’. To test such a hypothesis, it suffices to calculate the frequency with which small and large companies change, then compare these frequencies. The presence of a single contrary argument does not systematically invalidate the initial hypothesis. Conversely, as Miles and Huberman (1984a) point out, one cannot use the absence of contrary evidence as a tactic of decisive confirmation.
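The decision rule for purely refutable hypotheses described in the list above can be sketched as follows. The sketch assumes each type A company is recorded with two yes/no observations; the field names, the function name and the sample records are hypothetical. It shows the asymmetry of the rule: one counterexample refutes both the derived and the initial hypothesis, whereas the absence of counterexamples confirms only the derived hypothesis.

```python
def evaluate_derived_hypothesis(companies):
    """Apply the refutation rule to the derived hypothesis
    'type A companies call upon opinion leaders'.

    companies: list of dicts with boolean keys 'changed' and
               'used_opinion_leader' (hypothetical field names).
    """
    changed = [c for c in companies if c["changed"]]
    counterexamples = [c for c in changed if not c["used_opinion_leader"]]
    if counterexamples:
        # A single negative instance is enough to refute.
        return "derived and initial hypotheses refuted"
    # No negative instance: only the derived hypothesis is confirmed.
    return "derived hypothesis confirmed; initial hypothesis not confirmed"


# Hypothetical observations on type A companies.
type_a = [
    {"name": "A1", "changed": True, "used_opinion_leader": True},
    {"name": "A2", "changed": True, "used_opinion_leader": True},
]
print(evaluate_derived_hypothesis(type_a))

type_a_with_exception = type_a + [
    {"name": "A3", "changed": True, "used_opinion_leader": False},
]
print(evaluate_derived_hypothesis(type_a_with_exception))
```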

To move from testing a relationship to testing a model, it is not enough to juxtapose the relationships between the model’s variables. In fact, as we saw in the introduction to this section, we must ascertain the model’s global coher­ence. Within the framework of a qualitative method, and particularly when it is inductive, researchers face three sources of bias that can weaken their con­clusions (Miles and Huberman, 1984a):

  • The holistic illusion: attributing to events more convergence and coherence than they really have by eliminating the anecdotal facts that make up our social life.
  • Elitist bias: overestimating the importance of data from sources who are clear, well informed and generally have a high status while underestimating the value of data from sources that are difficult to handle, more confused, or of lower status.
  • Over-assimilation: losing one’s own vision or ability to pull back, and being influenced by the perceptions and explanations of local sources (Miles and Huberman, 1984a).

The authors propose a group of tactics for evaluating conclusions, ranging from simple monitoring to testing the model. These tactics permit the researcher to limit the effects of the biases described above.

2. Quantitative Methods

Causal models illustrate a quantitative method of evaluating and testing a causal model. However, evaluating a model is more than simply evaluating it statistically. It also involves examining its reliability and validity. These two notions are developed in detail in Chapter 10. We have, therefore, chosen to emphasize a system that is the most rigorous method for examining causal rela­tionships, and permits the researcher to increase the global validity of causal models. This is experimentation.

2.1. Experimental methods

The experimental system remains the favored method of proving that any variable is the cause of another variable. Under normal conditions, researchers testing a causal relationship have no control over the bias that comes from having multiple causes explaining a single phenomenon or the bias that occurs in data collection. Experimentation gives researchers a data collection tool that reduces the incidence of such bias as far as possible.

Experimentation describes the system in which researchers manipulate vari­ables and observe the effects of this manipulation on other variables (Campbell and Stanley, 1966). The notions of factor, experimental variables, independent variables and cause are synonymous, as are the notions of effect, result and dependent variables. Treatments refer to different levels or modalities (or combinations of modalities) of factors or experimental variables. An experi­mental unit describes the individuals or objects that are the subject of the experimentation (agricultural plots, individuals, groups, organizations, etc.). In experimentation, it is important that each treatment is tested on more than one experimental unit. This basic principle is that of repetition.

The crucial aim of experimentation is to neutralize the sources of variation other than the one the researcher wishes to measure (that is, the causal relationship tested). When researchers measure any causal relationship, they risk allocating the observed result to the cause being tested when the result is, in fact, explained by other causes (that is, the ‘confounding effect’). Two tactics are available for neutralizing this confounding effect (Spector, 1981). The first involves keeping constant those variables that are not being manipulated in the experiment. External effects are then directly monitored. The limitations of this approach are immediately obvious. It is impossible to monitor all the non-manipulated variables. As a rule, researchers will be content to monitor only the variables they consider important. The factors being monitored are known as secondary factors and the free factors are principal factors. The second tactic is random allocation, or randomization. This involves randomly dividing the experimental units among different treatments, in such a way that there are equivalent groups for each treatment. Paradoxically, on average, the groups of experimental units become equivalent not because the researcher sought to make them equal according to certain criteria (that is, variables), but because they were divided up randomly. The monitoring of external effects is, therefore, indirect. Randomization enables researchers to compare the effects of different treatments in such a way that they can discard most alternative explanations (Cook and Campbell, 1979).
For example, if agricultural plots are divided up between different types of soil treatment (an old and a new type of fertilizer), then the differences in yields cannot result from differences in the amount of sunshine they receive or the soil composition because, on the basis of these two criteria, the experimental units (agricultural plots) treated with the old type of fertilizer are, on average, comparable with those treated with the new type. Randomiza­tion can be done by drawing lots, using tables of random numbers or by any other similar method.
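Random allocation of this kind can be sketched in a few lines. The example assumes twenty agricultural plots and two fertilizer treatments; the plot labels, the function name and the seed are hypothetical. The units are shuffled and then dealt out in turn, so that group sizes differ by at most one.

```python
import random


def randomize(units, treatments, seed=None):
    """Randomly allocate experimental units to treatments.

    Units are shuffled, then dealt out to the treatments in turn.
    """
    rng = random.Random(seed)   # a seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


# Hypothetical experimental units and treatments.
plots = [f"plot_{i}" for i in range(1, 21)]
groups = randomize(plots, ["old fertilizer", "new fertilizer"], seed=42)
for treatment, members in groups.items():
    print(treatment, len(members), members)
```

Because allocation depends only on chance, characteristics such as sunshine or soil composition are balanced across the groups on average, not in every single draw.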

All experimentation involves experimental units, treatments, an effect and a basis for comparison (or control group) from which variations can be inferred and attributed to the treatment (Cook and Campbell, 1979). These different elements are grouped together in the experimental design, which permits researchers to:

  • select and determine the method for allocating experimental units to the different treatments
  • select the external variables to be monitored
  • choose the treatments and comparisons made as well as the timing of their observations (that is, the measurement grades).

There are two criteria researchers can use to classify their experimental program, with a possible crossover between them. They are the number of principal factors and the number of secondary (or directly monitored) factors being studied in the experimentation. According to the first criterion, the researcher studies two or more principal factors and possibly their interactions. Such a factorial design can either be complete (that is, all the treatments are tested) or it can be fractional (that is, only certain treatments are tested). According to the second criterion, total randomization occurs when there is no secondary factor (that is, no factor is monitored with a control). The experimental units are allocated randomly to different treatments in relation to the principal factors studied (for example, if there is one single principal factor which comprises three modalities, it constitutes three treatments, whereas if there are three principal factors with two, three and four modalities, that makes 2 x 3 x 4, or 24 treatments). When there is a secondary factor, we refer to a randomized block design. The randomized block design can either be complete (that is, all the treatments are tested within each block) or it can be incomplete. The experimental system is the same as for total randomization, except that experimental units are divided into subgroups according to the modalities of the variable being monitored, before being allocated randomly to different treatments within each subgroup. When there are two secondary factors, we refer to Latin squares. When there are three secondary factors, they are Greco-Latin squares, and when there are four or more secondary factors they are hyper-Greco-Latin squares. The different systems of squares require that the number of treatments and the number of modalities, or levels of each of the secondary factors, are identical.
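The treatment count and the logic of blocking can be illustrated with a short sketch. It first enumerates every treatment of a complete design with three principal factors of two, three and four modalities, then allocates treatments at random within each subgroup of units defined by one secondary factor. All factor names, modalities, unit labels and function names are hypothetical, and the blocking function assumes one unit per treatment in each block.

```python
import itertools
import random


def full_factorial(factors):
    """Enumerate every treatment (combination of modalities).

    factors: dict mapping factor name -> list of modalities.
    """
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in itertools.product(*factors.values())]


def randomized_complete_blocks(units_by_block, treatments, seed=None):
    """Within each block, assign every treatment to one unit at random."""
    rng = random.Random(seed)
    plan = {}
    for block, units in units_by_block.items():
        if len(units) != len(treatments):
            raise ValueError("a complete block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)                  # random order inside the block
        plan[block] = dict(zip(units, order))
    return plan


# Three hypothetical principal factors with 2, 3 and 4 modalities.
factors = {
    "fertilizer": ["old", "new"],
    "watering": ["low", "medium", "high"],
    "seed_variety": ["v1", "v2", "v3", "v4"],
}
treatments = full_factorial(factors)
print(len(treatments))  # 2 x 3 x 4 = 24

# One hypothetical secondary factor (sunshine) defining two blocks.
blocks = {"sunny": ["p1", "p2"], "shady": ["p3", "p4"]}
plan = randomized_complete_blocks(blocks, ["old", "new"], seed=1)
print(plan)
```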

In the experimental systems presented earlier, experimental units were ran­domly allocated to treatments. Agricultural plots are easier to randomize than individuals, social groups or organizations. It is also easier to conduct randomi­zation in a laboratory than in the field. In the field, researchers are often guests whereas, in the laboratory, they can feel more at home and often have virtually complete control over their research system. As a result, randomization is more common in the case of objects than people, groups or organizations and is more often used in the laboratory than during fieldwork.

Management researchers, who essentially study people, groups or organizations and are most often involved in fieldwork, rarely involve themselves in experimentation. In fact, in most cases, they only have partial control over their research system. In other words, they can choose the ‘when’ and the ‘to whom’ of their measurements but cannot control the scheduling of the stimuli, that is, neither the ‘when’ nor the ‘to whom’ of the treatments, nor their randomization, which is what makes true experimentation possible (Campbell and Stanley, 1966). This is ‘quasi-experimentation’.

Quasi-experimentation describes experimentation that involves treatments, measurable effects and experimental units, but does not use randomization. Unlike experimentation, a comparison is drawn between groups of non-equivalent experimental units that differ in several ways, other than in the presence or absence of a given treatment whose effect is being tested. The main difficulty for researchers wishing to analyze the results of quasi-experimentation is trying to separate the effects resulting from the treatments from those due to the initial dissimilarity between groups of experimental units.

Cook and Campbell (1979) identified two important arguments in favor of using the experimental process in field research. The first is the growing reticence among researchers to content themselves with experimental studies in a controlled context (that is, the laboratory), which often have limited theoretical and practical relevance. The second is researcher dissatisfaction with non-experimental methods when making causal inferences. Quasi-experimentation responds to these two frustrations. Or, more positively, it constitutes a middle way, a kind of convergence point for these two aspirations. From this point of view, there is bound to be large-scale development in the use of quasi-experimentation in management.

So far in this section we have been talking essentially about drawing causal inferences using variables manipulated within the framework of an almost com­pletely controlled system. In the following paragraphs, we focus on another family of methods that enable causal inferences to be made using data that is not necessarily experimental. These are causal models.

2.2. Statistical methods

There are three phases in the evaluation and testing of causal models: identifi­cation, estimation and measurement of the model’s appropriateness.

Every causal model is a system of equations in which the unknowns are the parameters to be estimated and the known values are the elements of the variance/covariance matrix. Identifying the causal model involves verifying whether this system of equations has infinitely many solutions, exactly one, or no exact solution. In the first case (fewer equations than unknowns, so that the parameters cannot be uniquely determined), the model is said to be under-identified and cannot be estimated. In the second case (a single solution), the model is said to be just identified and possesses zero degrees of freedom. In the third case (more equations than unknowns, so that in general no set of values satisfies them all exactly), the model is said to be over-identified. It possesses a positive number of degrees of freedom, equal to the difference between the number of elements in the variance/covariance (or correlation) matrix and the number of parameters to be estimated. If there are p variables in the model, the variance/covariance matrix contains p(p + 1)/2 distinct elements and the correlation matrix p(p – 1)/2. These two numbers have to be compared with the number of parameters to be estimated. However, in the case of complex models, it can be difficult to determine the exact number of parameters to be estimated. Fortunately, the computer software currently available automatically checks the identification of the models to be tested and displays error messages when a model is under-identified.
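The counting rule in this paragraph can be written as a short sketch. The function name and the example figures are hypothetical. The count is only a necessary condition: a model with a non-negative number of degrees of freedom can still be under-identified in some of its parts, which is one reason why software checks are useful.

```python
def identification_status(p, n_params, use_correlations=False):
    """Classify a model by the counting rule.

    p:        number of observed variables
    n_params: number of free parameters to be estimated
    Returns (degrees_of_freedom, status).
    """
    if use_correlations:
        n_elements = p * (p - 1) // 2   # distinct correlations
    else:
        n_elements = p * (p + 1) // 2   # distinct variances and covariances
    df = n_elements - n_params
    if df < 0:
        status = "under-identified"
    elif df == 0:
        status = "just identified"
    else:
        status = "over-identified"
    return df, status


# Hypothetical model: 6 observed variables, 13 free parameters.
print(identification_status(6, 13))   # 21 elements - 13 parameters
print(identification_status(6, 21))
print(identification_status(6, 25))
```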

Statistically testing a causal model only has interest and meaning when there is over-identification. Starting from the idea that the S matrix of the observed variances/covariance, which is calculated on a sample, reflects the true Σ matrix of the variances/covariance at the level of the whole population, one can see that, if the model’s system of equations is just identified (that is, the number of degrees of freedom is zero), then the C matrix reconstituted by the model will equal the S matrix. However, if the system is over-identified (that is, the number of degrees of freedom is strictly positive) then the correspondence will probably be imperfect because of the presence of errors related to the sample. In the latter case, estimation methods permit researchers to calculate parameters which will approximately reproduce the S matrix of the variances/covariance observed.

After the identification phase, the model’s parameters are estimated, most often using the criterion of least squares. A distinction can be drawn between simple methods (unweighted least squares) and iterative methods (maximum likelihood or generalized least squares, etc.). With each of these methods, the researcher has to find estimated values for the model’s parameters that minimize an F function. This function measures the difference between the observed values of the matrix of the variances/covariance and those of the matrix of variances/covariance predicted by the model. The parameters are assessed iteratively by a non-linear optimization algorithm. The F function can be written as follows:

$$F = \frac{1}{2}\,\mathrm{tr}\left[\left(W(S - C)\right)^{2}\right]$$

where S is the matrix of observed variances/covariance, C the matrix predicted by the model and W a weighting matrix that depends on the estimation method.

In the method of unweighted least squares, W equals I, the identity matrix. In the method of generalized least squares, W equals S⁻¹, the inverse of the matrix of the observed variances/covariance. In the method of maximum likelihood, W equals C⁻¹, the inverse, recalculated at each iteration, of the matrix of variances/covariance predicted.
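To make the role of W concrete, the sketch below evaluates a weighted least-squares discrepancy for given matrices. It takes F in the trace form 0.5 x tr[(W(S - C))²], which is an assumption chosen to be consistent with the three definitions of W above and not a formula quoted from the source. The helper names and the 2 x 2 matrices are hypothetical, and only the unweighted case (W = I) is shown to keep the example free of matrix inversion.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def trace(A):
    """Sum of the diagonal elements of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def identity(n):
    """n x n identity matrix (the weight matrix of unweighted least squares)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def discrepancy(S, C, W):
    """Assumed form of the fitting function: F = 0.5 * tr[(W (S - C))^2]."""
    D = [[s - c for s, c in zip(row_s, row_c)] for row_s, row_c in zip(S, C)]
    WD = matmul(W, D)
    return 0.5 * trace(matmul(WD, WD))


# Hypothetical observed (S) and model-implied (C) matrices.
S = [[2.0, 0.5], [0.5, 1.0]]
C = [[1.5, 0.5], [0.5, 1.0]]

f_uls = discrepancy(S, C, identity(2))
print(f_uls)                             # only one element differs, by 0.5
print(discrepancy(S, S, identity(2)))    # perfect reproduction gives 0
```

An estimation routine would vary the model’s parameters, recompute C each time and keep the values that make F smallest.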

After the estimation phase, the researcher has to verify the model’s appropriateness in relation to the empirical data. The appropriateness of a model in relation to the empirical data used to test it is greater when the gap between the predicted (C) and observed (S) variance/covariance matrices is small. However, the more parameters to be estimated the model has, the greater the chance that the gap will be reduced. For this reason, evaluation of the model must focus as much on the predictive quality of the variance/covariance matrix as on the statistical significance of each of the model’s elements.

Causal models offer a large number of criteria to evaluate the degree to which a theoretical model is appropriate in relation to empirical data. At a very general level, we can distinguish between two ways of measuring this appropriateness:

  • the appropriateness of the model can be measured as a whole (for example, with the chi-square (χ²) test)
  • the significance of the model’s different parameters can be measured (for example, the t test or z test).

Software offering iterative estimation methods, such as generalized least squares or maximum likelihood, usually provides a chi-square (χ²) test. This test compares the null hypothesis (that the model fits the data) with the alternative hypothesis. The model is considered to be acceptable if the null hypothesis is not rejected (in general, p > 0.05). This runs contrary to the classic situation in which models are considered acceptable when the null hypothesis is rejected. As a result, Type II errors (that is, not rejecting the null hypothesis when it is false) are critical in the evaluation of causal models. Unfortunately, the probability of Type II errors is unknown.

In order to offset this disadvantage, one can adopt a comparative rather than an absolute approach, and sequentially test a number of models whose differences are established by the addition or elimination of constraints (that is, ‘nested models’). In fact, if researchers have two models, one of which has added constraints, they can test the restricted model against the more general model by estimating them separately. If the restricted model is correct, then the difference between the chi-square values of the two models approximately follows a chi-square distribution whose number of degrees of freedom is the difference in degrees of freedom between the two models.
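The comparison of two nested models can be sketched as follows, using only the Python standard library. The upper-tail probability of the chi-square distribution is computed for whole numbers of degrees of freedom through the standard recurrence of the incomplete gamma function. The function names and the fit statistics in the example are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    df must be a positive integer.
    """
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_difference_test(chi2_restricted, df_restricted,
                         chi2_general, df_general, alpha=0.05):
    """Test a restricted model against the more general model it is nested in."""
    delta_chi2 = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    if delta_df <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    p_value = chi2_sf(delta_chi2, delta_df)
    reject_restricted = p_value < alpha
    return delta_chi2, delta_df, p_value, reject_restricted


# Hypothetical fit statistics for two nested models.
result = chi2_difference_test(chi2_restricted=45.2, df_restricted=20,
                              chi2_general=38.1, df_general=18)
print(result)
```

A small p-value suggests that the added constraints significantly worsen the fit, so the more general model would be preferred.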

It is, however, worth noting two major limitations of the chi-square test:

  • When the sample is very big, even very slight differences between the model and the data can lead to a rejection of the null hypothesis.
  • This test is very sensitive to possible discrepancies from a normal distribution.

As a result, other indices have been proposed to supplement the chi-square test. The main software packages, such as LISREL, EQS, AMOS or SAS, each offer more than a dozen. A first category of indices is based on the percentage of variance explained. For these, the models considered to be good are usually those whose indices are above 0.90. However, the distribution of these indices is unknown and one should therefore exclude any idea of testing appropriateness statistically. A second category groups together a set of indices which take real values and which are very useful for comparing models that have different numbers of parameters. It is usual, with these indices, to retain as the best those models whose indices have the lowest values.

In addition to these multiple indices for globally evaluating models, numerous criteria exist to measure the significance of a model’s individual parameters. The most widely used criterion is the ‘t’ ratio (that is, the ratio of the parameter’s estimated value to its standard error). This determines whether the parameter is significantly different from zero. Likewise, the presence of acknowledged statistical anomalies, such as negative variances and/or coefficients of determination that are negative or greater than one, is naturally clear proof of a model’s deficiency.

All in all, to meet strict requirements, a good model must present a satisfac­tory global explanatory value, contain only significant parameters and present no statistical anomalies.

In general, the use of causal models has come to be identified with the computer program LISREL, launched by Joreskog and Sorbom (1982). As renowned as this software might be, we shouldn’t forget that there are other methods of estimating causal models which may be better suited to certain cases. For example, the PLS method, launched by Wold (1982), does not require most of the restrictive assumptions associated with the maximum likelihood technique generally employed by LISREL (that is, a large number of observations and multivariate normality in the distribution of variables). Variants of the LISREL program, such as CALIS (SAS Institute, 1989) or AMOS (Arbuckle, 1997), and other programs such as EQS (Bentler, 1989) are now available in the SAS or SPSS software packages. In fact, the world of software for estimating causal models is evolving constantly, with new arrivals, disappearances and, above all, numerous changes.

Classification and Structuring Methods in Data Analysis

Data analysis manuals (Aldenderfer and Blashfield, 1984; Everitt, 1993; Hair et al., 1992; Kim and Mueller, 1978; Lebart et al., 1984) provide a detailed pre­sentation of the mathematical logic on which classification and structuring methods are based. We have chosen here to define these methods and their objectives, and consider the preliminary questions that would confront a researcher wishing to use them.

1. Definitions and Objectives

Classifying, condensing, categorizing, regrouping, organizing, structuring, summarizing, synthesizing and simplifying are just some of the procedures that can be done with a data set using classification and structuring methods. Taking this list as a starting point, we can formulate three propositions. First, the different methods of classification and structuring are aimed at condensing a relatively large set of data to make it more intelligible. Second, classifying data is a way of structuring it (that is, if not actually highlighting an inherent structure within the data, at least presenting it in a new form). Finally, struc­turing data (that is, highlighting key or general factors) is a way of classifying it – essentially by associating objects (observations, individuals, cases, variables, characteristics, criteria) with these key or general factors. Associating objects with particular dimensions or factors boils down to classifying into categories represented by these dimensions or factors.

The direct consequence of the above propositions is that, conceptually, the difference between methods of classification and methods of structuring is rela­tively slim. Although traditionally observations (individuals, cases, firms) are classified, and variables (criteria, characteristics) are structured, there is no rea­son, either conceptually or technically, why variables cannot be classified or observations structured.

While there are many different ways to classify and structure data, these methods are generally grouped into two types: cluster analysis and factor analysis. The main aim of cluster analysis is to group objects into homogeneous classes, with those objects in the same class being very similar and those in different classes being very dissimilar. For this reason, cluster analysis falls into the domain of ‘taxonomy’ – the science of classification. However, while it is possible to classify in a subjective and intuitive way, cluster analyses are automatic methods of classification using statistics. ‘Typology’, ‘cluster analysis’, ‘automatic classifi­cation’ and ‘numeric taxonomy’ are actually synonymous terms. Part of the rea­son for the diversity of terms is that cluster analyses have been used in many different disciplines such as biology, psychology, economics and management – where they are used, for example, to segment a firm’s markets, sectors or strate­gies. In management, cluster analyses are often used in exploratory research or as an intermediary step during confirmatory research.

Strategic management researchers have often needed to gather organiza­tions together into large groupings to make them easier to understand. Even early works on strategic groups (Hatten and Schendel, 1977), organizational clusters (Miles and Snow, 1978; Mintzberg, 1989), taxonomies (Galbraith and Schendel, 1983) or archetypes (Miller and Friesen, 1978) were already following this line of thinking. Barney and Hoskisson (1990) followed by Ketchen and Shook (1996) have provided an in-depth discussion and critique of the use of these analyses.

The main objective of factor analysis is to simplify data by highlighting a small number of general or key factors. Factor analysis combines different sta­tistical techniques to enable the internal structure of a large number of variables and/or observations to be examined, with the aim of replacing them with a small number of characteristic factors or dimensions.

Factor analysis can be used in the context of confirmatory or exploratory research (Stewart, 1981). Researchers who are examining the statistical validity of observable measurements of theoretical concepts (Hoskisson et al., 1993; Venkatraman, 1989; Venkatraman and Grant, 1986) use factor analysis for confirmation. This procedure is also followed by authors who have prior knowledge of the structure of the interrelationships among their data, knowledge they wish to test. In an exploratory context, researchers do not specify the structure of the relationship between their data sets beforehand. This structure emerges entirely from the statistical analysis, with the authors commenting upon and justifying the results they have obtained. This approach was adopted by Garrette and Dussauge (1995) when they studied the strategic configuration of interorganizational alliances.

It is possible to combine cluster and factor analysis in one study. For example, Dess and Davis (1984) had recourse to both these methods for identifying generic strategies. Lewis and Thomas (1990) did the same to identify strategic groups in the British grocery sector. More recently, Dess et al. (1997) applied the same methodology to analyze the performance of different entrepreneurial strategies.

2. Preliminary Questions

Researchers wishing to use classification and structuring methods need to con­sider three issues: the content of the data to be analyzed, the need to prepare the data before analyzing it and the need to define the notion of proximity between sets of data.

2.1. Data content

Researchers cannot simply take the available data just as they find it and imme­diately apply classification and structuring methods. They have to think about data content and particularly about its significance and relevance. In assessing the relevance of our data, we can focus on various issues, such as identifying the objects to be analyzed, fixing spatial, temporal or other boundaries, or counting observations and variables.

The researcher must determine from the outset whether he wishes to study observations (firms, individuals, products, decisions, etc.) or their characteris­tics (variables). Indeed, the significance of a given data set can vary greatly depending on which objects (observations or variables) researchers prioritize in their analysis. A second point to clarify relates to the spatial, temporal or other boundaries of the data. Defining these boundaries is a good way of judging the relevance of a data set. It is therefore extremely useful to question whether the boundaries are natural or logical in nature, whether the objects of a data set are truly located within the chosen boundaries, and whether all the significant objects within the chosen boundaries are represented in the data set. These last two questions link the issue of data-set boundaries with that of counting objects (observations or variables). Studies focusing on strategic groups can provide a good illustration of these questions. In such work, the objects to be analyzed are observations rather than variables. The time frames covered by the data can range from one year to several. Most frequently, the empirical context of the studies consists of sectors, and the definition criteria are those of official statis­tical organizations (for example, Standard Industrial Classification – SIC).

Another criterion used in defining the empirical context is that of geographic or national borders. A number of research projects have focused on American, British or Japanese industries. Clearly, the researcher must consider whether it is relevant to choose national borders to determine an empirical context. When the sectors studied are global or multinational, such a choice is hardly appro­priate. It is also valid to ask whether a sector or industry defined by an official nomenclature (for example SIC) is the relevant framework within which to study competitive strategies. One can check the relevance of such frameworks by questioning either experts or those actually involved in the system. Finally, one needs to ask whether data covering a very short time span is relevant, and to consider, more generally, the significance of cross-sectional data. When studying the dynamics of strategic groups or the relationship between strategic groups and performance, for example, it is important to study a longer time span.

As for the number and nature of observations and variables, these depend greatly on the way the data is collected. Many studies now make use of com­mercially established databases (Pims, Compustat, Value Line, Kompass, etc.), which give researchers access to a great amount of information. Determining the number of variables to include leads us to the problem of choosing which ones are relevant. Two constraints need to be respected: sufficiency and non­redundancy. The sufficiency constraint demands that no relevant variable should be omitted, and the non-redundancy constraint insists that no relevant variable should appear more than once, either directly or indirectly. These two constraints represent extreme requirements – in reality it is difficult to entirely fulfill them both, but clearly the closer one gets to fulfilling them, the better the results will be. To resolve these selection difficulties, the researcher can turn to theory, existing literature, or to expertise. Generally, it is preferable to have too many variables rather than too few, particularly in an exploratory context (Ketchen and Shook, 1996).

The problem of the number of observations to include poses the same con­straints: sufficiency and non-redundancy. For example, in the study of strategic groups, the sufficiency constraint demands that all firms operating within the empirical context are included in the study. The non-redundancy constraint insists that no firm should appear among the observations more than once. The difficulty is greater here than when determining which variables to include. In fact, the increase in diversification policies, mergers, acquisitions and alliances makes it very difficult to detect relevant strategic entities (or strategic actors). One solution lies in basing the study on legal entities. As legal entities are subject to certain obligations in their economic and social activities, this choice at least has the merit of enabling access to a minimum of economic and social information relating to the study at hand. Here, even more than in the case of variables, sector-based expertise must be used. Identifying relevant observations (such as strategic actors in a study of strategic groups) is an essentially qualitative process.

As a rule, researchers need to consider whether they have enough observations at their disposal. For factor analyses, some specialists recommend more than 30 observations per variable, and even as many as 50 or 100. Others say there must be 30 or 50 more observations than there are variables. There are also those who recommend four or five times more observations than variables. Hair et al. (1992) have argued that these criteria are very strict – they point out that quite often, researchers have to handle data in which the number of observations is hardly double the number of variables. Generally, when the number of obser­vations or variables seems insufficient, the researcher must be doubly careful in interpreting the results.

2.2. Preparing data

Preparing data ahead of applying classification and structuring methods is essentially a question of tackling the problems of missing values and outliers, and of standardizing variables.

Missing values The problem of missing values can be dealt with in a number of ways, depending both on the analysis envisaged and the number of obser­vations or variables involved.

Cluster analysis programs automatically exclude observations in which any values are missing. The researcher can either accept this imposed situation, or attempt to estimate the missing values (for example, by replacing the missing value with an average or very common value). If the researcher replaces the missing values with a fixed value – using, for instance, the mean or the mode of the variable in question – there is a risk of creating artificial classes or dimen­sions. This is because having an identical value recurring often in the data set will increase the proximity of the objects affected.

The question of missing data is, therefore, all the more important if a large number of values are missing, or if these missing values relate to observations or variables that are essential to the quality of the analysis.

Outliers The question of how to treat outliers is also an important issue, as most of the proximity measurements from which classification and structuring algorithms are developed are very sensitive to the existence of such points. An outlier is an anomalous object, in that it is very different from the other objects in the database. The presence of outliers can greatly distort analysis results, trans­forming the scatter of points into a compact mass that is difficult to examine. For this reason it is recommended that the researcher eliminates them from the database during cluster analysis and reintegrates them after obtaining classes from less atypical data. Outliers can then supplement results obtained using less atypical data, and can enrich the interpretation of these results. For example, an outlier may have the same profile as the members of a class that has been derived through analysis of more typical data. In such a case, the difference is, at most, one of degree – and the outlier can be assigned to the class whose profile it matches. Equally, an outlier may have a profile markedly different from any of the classes that have resulted from the analysis of more typical data. Here, the difference is one of nature, and the researcher must explain the particular positioning of the outlier in relation to the other objects. Researchers can use their intuition, seek expert opinions on the subject, or refer to theoreti­cal propositions which justify the existence or presence of an outlier.

Standardizing variables After attending to the questions of missing values and outliers, the researcher may need to carry out a third operation to prepare the data: standardizing, or normalizing, the variables. This operation allows the same weight to be attributed to all of the variables that have been included in the analysis. It is a simple statistical operation that in most cases consists of centering and scaling each variable so that it has a mean of zero and a standard deviation of one. This operation is strongly recommended by certain authors – such as Ketchen and Shook (1996) – when database variables have been measured using different scales (for example, turnover, surface area of different factories in square meters, number of engineers, etc.). Standardization is not essential if database variables have been measured using comparable scales. Even so, some researchers conduct statistical analyses on untreated variables and then on standardized variables so as to compare the results. Here, the solution is to select the analysis with the greater validity.
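The standardization step can be illustrated with a short Python sketch. The variable names and figures below are hypothetical, and the population standard deviation is used here as one possible convention; some packages divide by n − 1 instead.

```python
from statistics import mean, pstdev

def standardize(values):
    """Center a variable on zero and scale it to a standard deviation of one."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption of this sketch)
    if s == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - m) / s for v in values]

# Hypothetical variables measured on very different scales.
turnover = [120.0, 450.0, 80.0, 300.0]   # for example, in millions
engineers = [12.0, 40.0, 5.0, 33.0]      # head count

z_turnover = standardize(turnover)
z_engineers = standardize(engineers)
print([round(z, 2) for z in z_turnover])
print([round(z, 2) for z in z_engineers])
```

After the transformation both variables are expressed in standard deviation units, so neither dominates a distance calculation simply because of the scale on which it was measured.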

Some specialists remain skeptical about how useful the last two preparatory steps really are (for example, Aldenderfer and Blashfield, 1984). Nevertheless, it is worthwhile for researchers to compare the results of analyses obtained with and without standardizing variables and integrating extreme data (outliers). If the results are found to be stable, the validity of the classes or dimensions iden­tified is strengthened.

2.3. Data proximity

The notion of proximity is central to classification and structuring algorithms, all of which are aimed at grouping the most similar objects together and separating those that are farthest removed from each other. Two types of measurement are generally used to express proximity: distance measurements and similarity measurements. In general, distance measurements are used for cluster analyses and similarity measurements for factor analyses.

Researchers’ choices are greatly limited by the kind of analyses they intend to carry out and, above all, by the nature of their data (categorical or metric). With categorical data, the appropriate measurement to use is the chi-square distance. With metric data, the researcher can use the correlation coefficient for factor analyses and Euclidean distance for cluster analyses. Mahalanobis distance is recommended in place of Euclidean distance in the specific case of strong collinearity among variables. It must be noted that factor analyses function exclusively with similarity measurements, whereas cluster analyses can be used with both distance measurements and, although it is very rare, similarity measurements (for example, the correlation coefficient).

In classifying observations, distance measurements will associate observa­tions that are close across all of the variables while similarity measurements will associate observations that have the same profile – that is, that take their extreme values from the same variables. It can be said that similarity measure­ments refer to profile while distance measurements refer to position.
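The contrast between position and profile can be made concrete with a small Python sketch. The three firms and their scores on four variables are invented for illustration; Euclidean distance stands for the distance measurements and the Pearson correlation coefficient for the similarity measurements.

```python
import math

def euclidean(a, b):
    """Distance measurement: how far apart two observations are across all variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson(a, b):
    """Similarity measurement: how closely two observations share the same profile."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical scores of three firms on four variables.
firm_a = [1.0, 5.0, 2.0, 4.0]
firm_b = [6.0, 10.0, 7.0, 9.0]   # same profile as firm A, but at a higher level
firm_c = [2.0, 4.0, 4.0, 2.0]    # close to firm A in level, but a different profile

for name, firm in (("B", firm_b), ("C", firm_c)):
    print(name, round(euclidean(firm_a, firm), 2), round(pearson(firm_a, firm), 2))
```

In this example the distance measurement places firm A closer to firm C, whereas the similarity measurement associates firm A with firm B, whose extreme values fall on the same variables.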

A researcher may then quite possibly obtain different results depending on the proximity measurement (similarity or distance) used. If the results of clas­sification or structuring are stable whichever proximity measurements are used, a cluster or factor structure probably exists. If the results do not corres­pond, however, it could be either because the researcher measured different things, or because there is no real cluster or factor structure present.

Application of Classification and Structuring Methods in Data Analysis

This section puts the principal methods into practice. We recommend researchers seeking a more detailed discussion of the practical application of these structuring and classifying methods refer to a specialized data analysis manual, such as Hair et al. (1992).

1. Cluster Analysis

After clearly defining the environment from which the objects to be classified have been drawn, and then preparing the data appropriately, a researcher who undertakes to conduct a cluster analysis must choose a classification algorithm, determine the number of classes necessary and then validate them.

1.1. Choosing a classification algorithm

Choosing a classification algorithm involves deciding which procedure, hier­archical or non-hierarchical, to use in order to correctly group discrete objects into classes.

Several classification algorithms exist. Two different types of procedure are commonly distinguished: hierarchical procedures and non-hierarchical procedures.

Hierarchical procedures break down a database into classes that fit one inside the other in a hierarchical structure. These procedures can be carried out in an agglomerative or a divisive manner. The agglomerative method is the most widely used. From the start, each object constitutes a class in itself. The first classes are obtained by grouping together those objects that are the most alike. The classes that are the most alike are then grouped together – and the method is continued until only one single class remains. The divisive method proceeds by successive divisions, going from classes of objects to individual objects. At the start, all of the objects constitute a single class – this is then divi­ded to form two classes, which are as heterogeneous as possible. The procedure is repeated until there are as many classes as there are different objects.
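The agglomerative method described above can be sketched as follows. This is a deliberately small illustration, not the algorithm of any specific package: it assumes Euclidean distance and average linkage between classes (other linkage rules are possible), and the five two-dimensional points are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(class_1, class_2, points):
    """Mean distance between all pairs of objects drawn from the two classes."""
    total = sum(euclidean(points[i], points[j]) for i in class_1 for j in class_2)
    return total / (len(class_1) * len(class_2))

def agglomerate(points):
    """Merge the two most alike classes repeatedly until a single class remains."""
    clusters = [[i] for i in range(len(points))]  # at the start, each object is a class
    history = []                                  # (members of merged class, fusion level)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            level = average_linkage(clusters[a], clusters[b], points)
            if best is None or level < best[0]:
                best = (level, a, b)
        level, a, b = best
        merged = clusters[a] + clusters[b]
        history.append((sorted(merged), level))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return history

# Hypothetical objects described by two variables.
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 5.3), (9.0, 1.0)]
for members, level in agglomerate(points):
    print(members, round(level, 3))
```

Each line of the output corresponds to one grouping, together with the level at which it occurs. The sketch recomputes all pairwise linkages at every step, which is acceptable for a handful of objects but not for large databases.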

Several hierarchical classification algorithms exist. The Ward algorithm is the most often used in management research, as it favors the composition of classes of the same size. For a more in-depth discussion of the advantages and limits of each algorithm, the researcher can consult specialist statistical works, different software manuals (SAS, SPSS, SPAD, etc.), or articles that present meta-analyses of algorithms used in management research (Ketchen and Shook, 1996).

Non-hierarchical procedures – often referred to as K-means methods or iterative methods – involve groupings or divisions which do not fit one inside the other hierarchically. After having fixed the number (K) of classes he or she wishes to obtain, the researcher can, for each of the K classes, select one or several typical members – ‘core members’ – to input into the program.
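A non-hierarchical procedure seeded with core members can be sketched as below. The sketch assumes the standard K-means iteration (assign each object to the nearest class center, recompute the centers, repeat until assignments stop changing), which the text does not spell out; the data points and seeds are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, seeds, max_iter=100):
    """Classify points into len(seeds) classes, starting from researcher-chosen core members."""
    centers = [tuple(s) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        # Assign every object to the class whose center is nearest.
        new_assignment = [
            min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no object changed class: the classification is stable
        assignment = new_assignment
        # Recompute each center as the mean of its current members.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centers

# Hypothetical objects, with one core member chosen for each of K = 2 classes.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
seeds = [(1.0, 1.0), (8.0, 8.0)]
assignment, centers = k_means(points, seeds)
print(assignment)
print(centers)
```

Because the result depends on the seeds, running the sketch with different core members is a simple way to see how much the classification owes to the researcher's initial choices.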

Each of these two approaches has its strengths and its weaknesses. Hierarchical methods are criticized for being very sensitive to the environment from which the objects that are to be classified have been drawn, to the prepara­tory processing applied to the data (to allow for outliers and missing values, and to standardize variables) and to the method chosen to measure proximity. They are also criticized for being particularly prone to producing classes which do not correspond to reality. Non-hierarchical methods are criticized for relying completely on the subjectivity of the researcher who selects the core members of the classes. Such methods demand, meanwhile, good prior knowledge of the environment from which the objects being classified have been drawn; which is not necessarily the case in exploratory research. However, non-hierarchical methods are praised for not being over-sensitive to problems linked to the environment of the objects being analyzed – in particular, to the existence of outliers.

In the past, hierarchical methods were used very frequently, certainly in part for reasons of opportunity: for a long time these methods were the most well documented and the most readily available. Non-hierarchical methods have since become more accepted and more widespread. The choice of algorithm depends, in the end, on the researcher’s explicit or implicit hypotheses, on his or her degree of familiarity with the empirical context and on the prior exis­tence of a relevant theory or published work.

Several specialists advise a systematic combination of the two types of methods (Punj and Stewart, 1983). A hierarchical analysis can be conducted initially, to obtain an idea of the number of classes necessary, and to identify the profile of the classes and any outliers. A non-hierarchical analysis using the information resulting from hierarchical analysis (that is, number and composition of the classes) then allows the classification to be refined with adjustments, iterations and reassignments within and across the classes. This double procedure increases the validity of the classification (see Section 2, subsection 1.3 in this chapter).

1.2. Number of classes

Determining the number of classes is a delicate step that is fundamental to the classification process. For non-hierarchical procedures, the number of classes must be established by the researcher in advance – whereas for hierarchical proce­dures it is deduced from the results. While no strict rule exists to help researchers determine the ‘true’ or ‘right’ number of classes, several useful criteria and tech­niques are available to them (Hardy, 1994; Moliere, 1986; Ohsumi, 1988).

Virtually all hierarchical classification software programs generate graphic representations of the succession of groupings produced. These graphs – called dendrograms – consist of two elements: the hierarchical tree and the fusion index or agglomeration coefficient. The hierarchical tree is a diagrammatic reproduction of the classified objects. The fusion index or agglomeration coefficient is a scale indicating the level at which the agglomerations are effected. The higher the fusion index or agglomeration coefficient, the more heterogeneous the classes formed.

Figure 13.1 shows an example of a dendrogram. We can see that the objects that are the closest, and are the first to be grouped together, are objects 09 and 10. Aggregations then occur reasonably regularly, without any sudden rises, until the number of classes has been reduced to three. However, when we pass from three classes to two (see the arrow on Figure 13.2), there is a big ‘leap’ in the fusion index. The conclusion is that three classes should be kept.

The researcher may be faced with situations where there is no visible leap in the fusion index, or where there are several. The first situation may signify that there are not really any classes in the data. The second signifies that several class structures are possible.
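The rule of looking for a leap in the fusion index can be expressed as a short Python sketch. The function name and the fusion values are hypothetical; the values are assumed to be listed in the order in which the groupings occur, one per grouping.

```python
def classes_before_largest_leap(fusion_levels):
    """Return the number of classes to keep and the size of the largest leap.

    fusion_levels[i] is the fusion index of the (i + 1)-th grouping, so a
    database of n objects yields n - 1 values.
    """
    if len(fusion_levels) < 2:
        raise ValueError("at least two groupings are needed to measure a leap")
    n_objects = len(fusion_levels) + 1
    jumps = []
    for i in range(1, len(fusion_levels)):
        leap = fusion_levels[i] - fusion_levels[i - 1]
        classes_before = n_objects - i  # classes present before grouping i is made
        jumps.append((leap, classes_before))
    leap, n_classes = max(jumps)
    return n_classes, leap

# Hypothetical fusion index values for ten objects (nine successive groupings).
fusion = [0.5, 0.8, 1.1, 1.5, 1.9, 2.4, 2.9, 6.8, 7.5]
n_classes, leap = classes_before_largest_leap(fusion)
print(n_classes, round(leap, 1))
```

The sketch always returns an answer, even when no leap stands out or when several leaps are of similar size, so the full list of differences should still be inspected before settling on a number of classes.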

Finally, another often used criterion is the CCC (Cubic Clustering Criterion). This is a means of relating intra-class homogeneity to inter-class heterogeneity. Its value for each agglomeration coefficient (that is, each number of classes) is produced automatically by most automatic classification software programs. The number of classes to use is the number for which the CCC reaches a maxi­mum value – a ‘peak’. Several researchers have used this criterion (Ketchen and Shook, 1996).

1.3. Validating the classes

The final step in using cluster analyses is to verify the validity of the classes obtained. The aim is to ensure the classification has sufficient internal and external validity (the concept of validity is presented in detail in Chapter 10). In the case of cluster analyses, there are three important aspects to consider: reliability, predictive validity and external validity.

The reliability of the instruments used can be evaluated in several ways. The researcher can apply different algorithms and proximity measurements then compare the results obtained. If the classes highlighted remain the same, the classification is reliable (Hair et al., 1992; Ketchen and Shook, 1996; Lebart et al., 1984). Equally, one can divide a sufficiently large database into two parts and carry out the procedures on each of the separate parts. Concordance of the results is an indication of their reliability. Hambrick’s (1983) research on mature industrial environments is a good example of this method.

Predictive validity should always be examined in relation to an existing conceptual base. Thus, the many authors who have used cluster analyses to identify strategic groups would be able to measure the predictive validity of their classifications by studying the relationship between the classes they obtained (that is, the strategic groups) and performance. In fact, the strategic groups theory stipulates that membership of a strategic group has a determining influ­ence on performance (Porter, 1980). If a classification enables us to predict per­formance, it has a good predictive validity.

There are no tests specifically designed to test the external validity of cluster analyses. One can still, however, appreciate the quality of the classification by carrying out traditional statistical tests (Fisher’s F, for example) or analyses of the variance between the classes and external measurements. For example, a researcher may set about classifying industrial supply firms and find two classes; that of the equipment suppliers and that of the subcontractors. To test the validity of his or her typology, the researcher may carry out a statistical test on the classes obtained and on a variable not taken into account in the typology. If the test is significant, he or she will have strengthened the validity of the classification. If the reverse occurs, he or she will need to examine the reasons for this non-validation. The researcher could question, for instance, whether the external measurement he or she has chosen is suitable, whether there are errors in his or her interpretation of the classes and whether the algorithms he or she has chosen are consistent with the nature of the variables and his or her research method.
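As an illustration of such a test, the sketch below computes Fisher's F for a one-way analysis of variance between classes on an external variable. The two classes follow the supplier example above, but the external variable and its values are invented, and the function name is hypothetical.

```python
def one_way_f(groups):
    """Fisher's F statistic comparing the means of an external variable across classes."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation between classes and within classes.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical external variable, not used to build the typology.
equipment_suppliers = [10.0, 12.0, 11.0, 13.0]
subcontractors = [20.0, 22.0, 21.0, 23.0]

f_value, df_between, df_within = one_way_f([equipment_suppliers, subcontractors])
print(f_value, df_between, df_within)
```

The resulting F value is then compared with the critical value of the F distribution for the corresponding degrees of freedom, taken from a statistical table or a statistics package. The sketch assumes that at least one class shows some internal variation; otherwise the within-class term is zero and F is undefined.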

The external validity of a classification can also be tested by carrying out the same analysis on another database and comparing the results obtained (Hair et al., 1992). This method is difficult to use in most research designs in manage­ment, however, as primary databases are often small and it is not easy to access complementary data. It is rarely possible to divide the data into different sample groups. Nevertheless, this remains possible when the researcher is working with large secondary databases.

1.4. Conditions and limitations

Cluster analyses can be useful research tools in many ways. Not only are they fundamental to studies aimed at classifying data, but they are also regularly used to explore data, because they can be applied to all kinds of data.

In theory, we can classify everything. But while this may be so, researchers should give serious thought to the logic of their classification strategies. They must always consider the environmental homogeneity of the objects to be clas­sified, and the reasons for the existence of natural classes within this environ­ment – and what these may signify.

The subjectivity of the researcher greatly influences cluster analysis, and represents one of its major limitations. Even though there are a number of criteria and techniques available to assist researchers in determining the number of classes to employ, the decision remains essentially up to the researcher alone. Justification is easier when these classes are well defined, but in many cases, class boundaries are less than clear-cut, and less than natural.

In fact, cluster analysis carries a double risk. A researcher may attempt to divide a logical continuum into classes. This is a criticism that has been leveled at empirical studies that attempt to use cluster analysis to validate the existence of the two modes of governance (hierarchical and market) proposed by Williamson. Conversely, the researcher may be attempting to force together objects that are very isolated and different from each other. This criticism is sometimes leveled at works on strategic groups which systematically group firms together (Barney and Hoskisson, 1990).

The limitations of classification methods vary according to the researcher’s objectives. These limitations are less pronounced when researchers seek only to explore their data than when their aim is to find true object classes.

2. Factor Analysis

Factor analysis essentially involves three steps: choosing an analysis algorithm, determining the number of factors and validating the factors obtained.

2.1. Choosing a factor analysis technique

Component analysis and common and specific factor analysis. There are two basic factor analysis techniques (Hair et al., 1992): ‘classic’ factor analysis, also called common and specific factor analysis, or CSFA, and component analysis, or CA. In choosing between the two approaches, the researcher should remember that, in the framework of factor analysis, the total variance of a variable is expressed in three parts: (1) common, (2) specific and (3) error. Common variance describes what the variable shares with the other analysis variables. Specific variance relates to the one variable in question. Error variance comes from the imperfect reliability of the measurements or from a random component in the variable measured.

In a CSFA, only common variance is taken into account. The variables observed, therefore, are the linear combinations of non-observed factors, also called latent variables. Component analysis, on the other hand, takes total variance into account (that is, all three types of variance). In this case it is the ‘factors’ obtained that are linear combinations of the observed variables. The choice between CSFA and CA methods depends essentially on the researcher’s objectives. If the aim is simply to summarize the data, then CA is the best choice. However, if the aim is to highlight a structure underlying the data (that is, to identify latent variables or constructs) then CSFA is the obvious choice. Both methods are very easy to use and are available in all major software packages.

Correspondence factor analysis. A third, albeit less common, type of factor analysis is also possible: correspondence factor analysis (Greenacre, 1993; Greenacre and Blasius, 1994; Lebart and Mirkin, 1993). This method is used only when a data set contains categorical variables (that is, nominal or ordinal). Correspondence factor analysis was invented in France and popularized by Jean-Paul Benzecri’s team (Benzecri, 1992). Technically, correspondence factor analysis is similar to conducting a CA on a table derived from a categorical database. This table is derived directly from the initial database using correspondence factor analysis software. However, before beginning a correspondence factor analysis, the researcher has to categorize any metric variables present in the initial database. An example of such categorization is the transformation of a metric variable such as ‘size of a firm’s workforce’ into a categorical variable, with the modalities ‘small’, ‘medium’ and ‘large’. Any metric variable can be transformed into a categorical variable in this way.

Otherwise, aside from the restriction of it being solely applicable to the analysis of categorical variables, correspondence factor analysis is subject to the same constraints and same operating principles as other types of factor analysis.

2.2. Number of factors

Determining the number of factors is a delicate step in the structuring process. Again, there is no general rule for determining the ‘right’ number of factors, but a number of criteria are available to assist the researcher in approaching this problem (Stewart, 1981). We can cite the following criteria:

‘A priori specification’. This refers to situations in which the researcher already knows how many factors need to be included. This approach is relevant when the research is aimed at testing a theory or hypothesis relative to the number of factors involved, or when the researcher is replicating earlier research and wishes to extract exactly the same number of factors.

‘Minimum restitution’. The researcher fixes in advance a level corresponding to the minimum percentage of information (that is, of variance) that is to be conveyed by all of the factors retained (for example, 60 per cent). While in the exact sciences, percentages of 95 per cent are frequently required, in management percentages of 50 per cent and even much lower are often considered satisfactory (Hair et al., 1992).

The Kaiser rule. According to the Kaiser rule, the researcher includes only those factors whose eigenvalues (calculated automatically by computer software) are greater than one. The Kaiser rule is frequently applied in management science research, although it is only valid without restrictions in the case of a CA carried out on a correlation matrix. In the case of a CSFA, the Kaiser rule is too strict: the researcher can retain a factor whose eigenvalue is less than one, as long as this value is greater than the mean of the variables’ communalities (common variances). The Kaiser rule gives the most reliable results for cases including from 20 to 50 variables. Below 20 variables, it tends to reduce the number of factors, and above 50 variables, to increase it.

Example: Factors and associated eigenvalues

Table 13.1 presents the results of a factor analysis. Eleven variables characterizing 40 firms were used in the analysis. For each variable, communality represents the share of common variance. The first six factors are examined in the example. According to the Kaiser rule, only the first four factors must be retained (they have an eigenvalue of more than 1). In total, these first four factors reproduce 77.1 per cent of the total variance.
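The Kaiser rule and the ‘minimum restitution’ criterion can both be read directly from the eigenvalues of the correlation matrix. The following is a minimal sketch in Python, assuming a component analysis carried out on a correlation matrix; the function name `retained_factors`, its parameters and the synthetic data are illustrative assumptions and do not reproduce Table 13.1.

```python
import numpy as np

def retained_factors(data, min_variance_share=0.60):
    """Return the eigenvalues and the number of factors suggested by
    the Kaiser rule and by a 'minimum restitution' threshold.

    `data` is an (observations x variables) array. Hypothetical helper,
    written for illustration only.
    """
    # Correlation matrix of the observed variables (the columns of `data`).
    corr = np.corrcoef(data, rowvar=False)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending
    # order; reverse them so that the largest comes first.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    # Kaiser rule: keep the factors whose eigenvalue is greater than one.
    n_kaiser = int(np.sum(eigenvalues > 1.0))
    # Minimum restitution: smallest number of factors whose cumulative
    # share of the total variance reaches the threshold fixed in advance.
    cumulative_share = np.cumsum(eigenvalues) / eigenvalues.sum()
    n_restitution = int(np.searchsorted(cumulative_share, min_variance_share)) + 1
    n_restitution = min(n_restitution, len(eigenvalues))
    return eigenvalues, n_kaiser, n_restitution

# Synthetic example: 40 observations described by 11 variables.
rng = np.random.default_rng(0)
sample = rng.normal(size=(40, 11))
eigenvalues, n_kaiser, n_restitution = retained_factors(sample, 0.60)
print(n_kaiser, n_restitution)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue greater than one identifies a factor that conveys more variance than a single standardized variable.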

The scree test. The eigenvalues are ranked in decreasing order and any definite leveling-out of the curve is noted. The number of factors to include is then the number corresponding to the point where this leveling-out begins. Factor analysis software packages can generate a graphic visualization – called a ‘scree plot’ or a ‘scree test’ – of eigenvalues, which facilitates detection of such leveling. Figure 13.3 shows an example of a ‘scree plot’. It represents the eigenvalues of the first 14 factors resulting from a CA. We note that, after the fourth factor, the eigenvalues stabilize (see arrow on Figure 13.3). The number of factors to retain is, therefore, four.

Factor interpretation is at the heart of factor analysis, notably in CSFA, where it is often important to understand and sometimes to name the latent variables (that is, the factors). One frequently used technique is rotation. Rotation is an operation that simplifies the structure of the factors. Ideally, each factor would load with only a small number of variables and each variable would load with only a small number of factors, preferably just one. This would enable easy differentiation of the factors. A distinction needs to be made between orthogonal rotations and oblique rotations. In an orthogonal rotation, the factors remain orthogonal in relation to each other while, in an oblique rotation, this constraint is removed and the factors can be correlated with each other. The rotation operation consists of two steps. First, a CA or a CSFA is carried out. On the basis of the previously mentioned criteria, the researcher chooses the number of factors to retain; for example, two. Rotation is then applied to these factors. We can cite three principal types of orthogonal rotation: Varimax, Quartimax and Equamax.

The most widespread method is Varimax, which seeks to minimize the number of variables strongly loaded with a given factor. For each factor, variable correlations (factor loadings) approach either one or zero. Such a structure generally facilitates interpretation of the factors, and Varimax seems to be the method that gives the best results. The Quartimax method aims to facilitate the interpretation of the variables by making each one strongly load with one factor, and load as little as possible with all the other factors. This means that several variables can be strongly loaded with the same factor. In this case, we obtain a kind of general factor linked to all the variables. This is one of the main defects of the Quartimax method. The Equamax method is a compromise between Varimax and Quartimax. It attempts to somewhat simplify both factors and variables, but does not give very incisive results and remains little used. Oblique rotations are also possible, although these are given different names depending on the software used (for example, Oblimin on SPSS or Promax on SAS). Oblique rotations generally give better results than orthogonal rotations.
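To make the idea of an orthogonal rotation concrete, the sketch below applies a Varimax rotation to a loading matrix in Python. It uses a common iterative scheme based on the singular value decomposition, without the row normalization that some software packages apply first, so it should be read as an illustration under those assumptions rather than as the procedure implemented in SPSS or SAS. The function name `varimax` and the small loading matrix are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix.

    Returns the rotated loadings and the rotation matrix.
    """
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated**3 - rotated * (rotated**2).sum(axis=0) / n_vars
        )
        # The product u @ vt is the orthogonal matrix closest to the gradient.
        u, singular_values, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        current = singular_values.sum()
        # Stop when the criterion no longer improves noticeably.
        if previous > 0 and current < previous * (1 + tol):
            break
        previous = current
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: five variables on two factors.
unrotated = np.array([
    [0.8, 0.3],
    [0.7, 0.4],
    [0.2, 0.9],
    [0.3, 0.6],
    [0.6, 0.5],
])
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, the communality of each variable (the sum of its squared loadings) is unchanged; only the distribution of the loadings across the factors is modified.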

To interpret the factors, the researcher must decide on which variables are significantly loaded with each factor. As a rule, a loading greater than 0.30 in absolute value is judged to be significant and one greater than 0.50 very significant. However, these values must be adjusted in relation to the size of the sample, the number of variables and factors retained. Fortunately, many software packages automatically indicate which variables are significant.

Example: Matrix, rotations and interpretation of the factors

Tables 13.2 and 13.3 follow from the factor analysis results presented above in Table 13.1. These tables reproduce the standard output of factor analysis software. We should remember that the first four factors were retained according to the Kaiser rule (eigenvalue greater than one). Table 13.2 presents the matrix of the factors before rotation. One can conclude that the variables ‘assets’, ‘workforce’ and ‘turnover’ are strongly and essentially loaded to Factor 1 and that the variable ‘financial profitability’ is strongly and essentially loaded to Factor 2. However, the other variables are strongly loaded to several factors at once. Such a situation makes interpretation relatively difficult. It can, therefore, be useful to proceed to a factor rotation.

Table 13.3 presents the matrix of factors after a Varimax rotation. One can conclude that the variables ‘assets’, ‘workforce’ and ‘turnover’ are still strongly and essentially loaded to Factor 1. The variables ‘economic profitability’, ‘financial profitability’ and ‘margin’ seem to be strongly and essentially loaded to Factor 2. The variables ‘export’ and ‘international’ as well as, to a lesser degree, the variable ‘communication’, are strongly and essentially loaded to Factor 3. Finally, the variable ‘R&D’ and, to a lesser degree, the variable ‘USA’ are strongly and essentially loaded to Factor 4. In conclusion, the interpretation of the factors is simplified: Factor 1 represents ‘size’, Factor 2 ‘profitability’, Factor 3 ‘internationalization policy’ and Factor 4 ‘research and development policy’.

2.3. Validation

The final step in a factor analysis involves examining the validity of the factors obtained. The same methods that may be applied to increase the reliability of cluster analyses (that is, cross-correlating algorithms, dividing a database) can also be used for factor analyses.

Factor analyses are often aimed at identifying latent dimensions (that is, variables that are not directly observable) that are said to influence other variables. In strategic management, for example, numerous ‘factors’ have been found to influence the performance of companies. These include strategy, organizational structure, planning, information and decision-making systems. Researchers wanting to operationalize such factors (that is, latent variables) could study the predictive validity of the operationalizations obtained. For example, researchers who undertake to operationalize the three ‘generic strategies’ popularized by Porter (1980) – overall low cost, differentiation and focus – would then be able to examine the predictive validity of these three factors by evaluating their relationship to the firms’ performances.

Researchers can test the external validity of their factor solutions by replicating their study in another context or with another data set. That said, in most cases it would not be possible for the researcher to access a second empirical context. The study of external validity can never simply be a mechanical operation. A thorough preliminary consideration of the content of the data to be analyzed, as developed in the first section of this chapter, can provide a good basis for studying the external validity of a factor analysis.

2.4. Conditions and limitations

Factor analysis is a very flexible tool, with several possible uses. It can be applied to all kinds of objects (observations or variables) in various forms (tables of metric or categorical data, distance matrices, similarity matrices, contingency and Burt tables, etc.).

As with cluster analyses, the use of factor analysis entails a certain number of implicit hypotheses relating to the environment of the objects to be structured. Naturally, there is no reason why the factors identified should necessarily exist in a given environment. Researchers wishing to proceed to a factor analysis must, therefore, question the bases – theoretical or otherwise – for the existence of a factor structure in the particular environment of the objects to be structured. Most factor analysis software automatically furnishes indicators through which the probability of a factor structure existing can be determined and the quality of the factor analysis assessed. Low quality is an indication of the absence of a factor structure or the non-relevance of the factorial solution used by the researcher.

Finally, it must be noted that the limitations of factor analysis vary depending on the researcher’s objectives. Researchers wishing simply to explore or synthesize data have much greater freedom than those who propose to find or to construct underlying factors.

General Presentation and Data Collection for Social Network Research

Network analysis essentially involves revealing the links that exist between units. These units can be individuals, actors, groups, organizations or projects.

When researchers decide to use network analysis methods, and to look at links between units of analysis, they should be aware that they are implicitly entering into the paradigm of structural analysis. Structural sociology proposes going beyond the opposition that exists in sociology between holistic and individualistic traditions, and for this reason gives priority to relational data.

Analysis methods can be used by researchers taking a variety of different approaches, whether inductive, hypothetico-deductive, static or dynamic. Different ways of using the methods are presented in the first part of this section. Researchers may find they face difficulties that are linked as much to data collection as to the sampling such collection requires. These problems are dealt with in the second part.

1. Using Social Network Analysis Methods

Researchers can use social network analysis to approach a particular management problem in many different ways. The flexibility of these methods means they can be used inductively or to test a conceptual framework or a set of hypotheses. Moreover, recent developments in network analysis have made it easier to take the dynamics of the phenomena into account.

1.1. Inductive or hypothetico-deductive approaches

The descriptive power of network analysis can make it a particularly apt tool with which to seek a better understanding of a structure. Faced with a reality that can be difficult to grasp, researchers need tools that enable them to interpret this reality, and the data processing methods of network analysis are able to meet this need. General indicators or a sociogram (the graphic representation of a network) can, for example, help researchers to better understand a network as a whole. Calculation of the centrality score or detailed analysis of the sociogram can then enable the central individuals of the structure to be identified. Finally, and still using a sociogram or the grouping methods (which are presented in the second section of this chapter), researchers may reveal the existence of strongly cohesive subgroups within the network (individuals strongly linked to each other), or of groups of individuals who have the same relationships with the other members of the network. Network analysis can thus be used as ‘an inductive method for describing and modeling relational structure [of a network]’ (Lazega, 1994: 293). Lazega’s case study of a firm of business lawyers (see the example below) illustrates the inductive use of network analysis. Studying cliques within this company (groups of individuals in which each is linked to every other member of the clique) enabled him to show how organizational barriers are crossed by small groups of individuals.

With an inductive approach, it is often advisable to use network analysis as a research method that is closely linked to the collection of qualitative data. In fact, as Lazega (1994) underlines, network analysis often only makes sense when qualitative analysis is also used – to provide the researcher with a real understanding of the context, so that the results obtained can be properly understood and interpreted.

Network analysis is by no means restricted to inductive use. It can, in fact, enable a large number of concepts to be operationalized. There is a great deal of research in which structural data has been used to test hypotheses. For example, centrality scores are often used as explanatory variables in studies of power within organizations. In general, all the methods used in network analysis can be used hypothetico-deductively. Not only are there methods aimed at drawing out individual particularities, but researchers can also use the fact that someone belongs to a subgroup in an organization or network as an explanatory or explained variable. This is what Roberts and O’Reilly do when they use a structural equivalence measurement to evaluate whether individuals are ‘active participants’ or not within the United States Navy (see example below).

Example: Hypothetico-deductive use of network analysis

In their study conducted within the US Navy, Roberts and O’Reilly (1979) focused on the individual characteristics of ‘participants’ – people who play a communicative role in the organization. Three networks were analyzed, each relating to communication of different kinds of information. One network related to authority, another was a ‘social’ network (unrelated to work) and the third related to expertise. The individual characteristics used were: rank, level of education, length of time they had been in the navy, need to accomplish, need for power, work satisfaction, performance, involvement and perceived role in communications. The hypotheses took the following form: ‘participants have a higher rank than non-participants’ or ‘participants have greater work satisfaction than non-participants’. A questionnaire was used to collect both structural data and data on individual characteristics. Network analysis then enabled participant and non-participant individuals to be identified. Actors were grouped into two classes of structural equivalence – participants and non-participants – according to the type of relationship they had with the other people questioned. The hypotheses were then tested by means of discriminant analyses.

1.2. Static or dynamic use of network analysis methods

Network analysis methods have often been criticized for being like a photograph taken at one precise moment. It is, in fact, very rare for researchers to be in a position to conduct a dynamic study of a network’s evolution. This can only be achieved by reconstituting temporal data using an artefact (video-surveillance tapes, for example) or by having access to direct observations on which to base the research. More often, the network observed is a photograph, a static observation. Static approaches can nevertheless lead to very interesting results, as the two examples presented below clearly show. However, references to the role time plays within networks can be found in the literature. Following on from Suitor et al. (1997), many works examine the dynamics of networks, principally asking how this can be taken into account so as to improve the quality of the data collected.

For example, in certain studies, data is collected at successive moments in time. Researchers then present the evolution of the network as a succession of moments in the same way that the construction of discrete variables approximates a continuous phenomenon. This type of research could be aimed at improving our understanding of the stability of networks, and their evolution over time. Here, the researcher seeks to describe or explain the way in which the number of links, their nature or even their distribution evolves. As an illustration, we can cite Welman et al. (1997). Taking a longitudinal approach, they show that neither an individual’s centrality nor a network’s social density is linked to the preservation of the links between actors. Similarly, the fact that a link is strong does not mean it will be long-lasting.

Collecting dynamic data about networks also enables researchers to carry out work for which the implications are more managerial than methodological. For example, Abrahamson and Rosenkopf (1997) propose an approach that enables researchers to track the diffusion of an innovation. Here, the approach is truly dynamic and the research no longer simply involves successive measurements.

2. Data Collection

Data collection is a delicate phase of network analysis. The analyses carried out are, in fact, sensitive to the slightest variations in the network analyzed. Leik and Chalkley (1997) outline some potential reasons for change: the instability inherent in the system (for example, the mood of the people questioned), change resulting from the system’s natural dynamics (such as maternity leave) and external factors (linked, for example, to the dynamics of the sector of activity). It is, therefore, essential to be extremely careful about the way in which data is collected, the tools that are used, the measurements taken to assess the strength of the links, and how the sample is constructed (how to define the network’s boundaries).

2.1. Collection tools

Network analysis is about relationships between individual or collective units. The data researchers obtain is relational data. In certain situations the collection of such data poses no particular problem – it may be possible to collect data using secondary sources, which often prove to be entirely reliable. This is the case, for example, with research that makes use of the composition of boards of directors or inter-company cooperative ventures. The fact that this type of data is often used indicates that it is reliable relational information that is often relevant and easily accessible. For example, Mizruchi and Stearns (1994) use data on the boards of directors of large American companies taken from Standard and Poor’s and Moody’s annual directories. These directories enabled them to find out who was on the boards of directors and trace the boards’ evolution. The authors were particularly interested in the presence of bank representatives on these boards.

Direct observation is sometimes feasible. One might, for example, study interactions in a particular place such as an office or a tea-room. However, in practice such possibilities for direct observation are quite rare. It is also possible to use certain relational artefacts, such as the minutes of meetings.

Most research based on network analysis uses surveys or interviews to collect data. It is, in fact, difficult to obtain precise data about the nature of the relationships between individuals in the network being analyzed by any other means. The obvious advantage of surveys is that they can reach a large number of people. However, the data collected is often more ‘flimsy’ than that obtained in interviews. The researcher’s presence during the interview means he or she can reply directly to questions raised by the respondent during the research. In an interview situation researchers can also make sure that respondents fully understand what is asked of them and that, from start to finish, they are serious in their replies.

Name generators. Whether researchers use surveys or conduct interviews, relational data can be collected by means of ‘name generators’. A name generator is a question either about the links the person being questioned has with the other members of the network or about his or her perception of the links that exist between members of the network. Table 14.1 gives some examples of name generators used in management research. Several collection techniques are possible.

One technique involves asking the respondent to cite the people concerned, possibly in order of importance (or frequency). Researchers can facilitate responses by supplying a list of names (all the members of the organization, for example). Another technique involves asking who they would choose as recipients if a message had to be sent to people with a certain profile.

The use of name generators is a delicate matter. They rely on the respondents’ capacity to recall the actors with whom they are linked in a given type of relationship (for example, the people they have worked with over the past month).

Valued data. Researchers can evaluate not only the existence of a link but also its strength. These evaluations produce a valued network. If a flow of information or products circulates among the actors, we also talk about flow measurements. There are a number of ways of collecting this type of data.

The time an interaction takes, for example, can be included in a direct observation of the interaction. Certain secondary data can enable us to assess the strength of a link. If we return to the example of links between boards of directors, the number of directors or the length of time they are on the board can both be good indicators. The following example illustrates another way that secondary data can be used to evaluate the strength of links.

In the many cases in which surveys or interviews are used, researchers must include a specific measurement in their planning to evaluate the strength of relational links. They can do this in two ways. They can either introduce a scale for each relationship, or they can classify individuals in order of the strength of the relationship. The use of scales makes the survey or interview much more cumbersome, but does enable exact evaluation of the links. Likert scales are most commonly used, with five or seven levels ranging, for example, from ‘very frequent’ to ‘very infrequent’ (for more information on scales, see Chapter 9). It is easier to ask respondents to cite actors in an order based on the strength of the relationship, although the information obtained is less precise. In this case, the researcher could use several name generators, and take as the relationship’s value the number of times an individual cites another individual among their first n choices.
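The ranking-based measurement described above can be sketched in a few lines of Python. The structure of `responses` (one ranked list of names per respondent and per name generator), the function name `link_values` and the sample answers are assumptions made for illustration; the sketch also assumes that a name appears at most once in each list.

```python
from collections import Counter

def link_values(responses, n):
    """Value each directed link by the number of name generators in which
    the cited person appears among the respondent's first n choices."""
    values = Counter()
    for respondent, generators in responses.items():
        for ranked_names in generators.values():
            # Only the first n names cited for this generator are counted.
            for cited in ranked_names[:n]:
                if cited != respondent:
                    values[(respondent, cited)] += 1
    return values

# Hypothetical answers to two name generators ('advice' and 'work').
responses = {
    "A": {"advice": ["B", "C", "D"], "work": ["B", "D", "C"]},
    "B": {"advice": ["A"], "work": ["C", "A"]},
}
values = link_values(responses, n=2)
print(values[("A", "B")], values[("A", "C")])
```

With two name generators, each link receives a value between zero and two; adding further generators widens the range of possible values.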

Biases. To conclude this part of the chapter, we will now examine the different types of biases researchers may face during data collection.

First, there is associative bias. Associative bias occurs in the responses if the individuals who are cited successively are more likely to belong to the same social context than individuals who have not been cited successively. Such a bias can alter the list of units cited and, therefore, the final network. Research has been carried out recently which compares collection techniques to show that this type of bias exists (see, for example, Brewer, 1997). Burt (1997) recommends the parallel use of several redundant generators (that is, relating to relationships of a similar nature). This redundancy can enable respondents to mention a different name, which can in turn prompt them to recall other names.

Another problem researchers face is the non-reciprocity of responses among members of the network. If A cites B for a given relationship, B should, in principle, cite A. For example, if A indicates that he or she has worked with B, B should say that he or she has worked with A. Non-reciprocity of responses can be related to a difference in cognitive perception between the respondents (Carley and Krackhardt, 1996). Researchers must determine in advance a fixed decision-making rule for dealing with such non-reciprocity of data. They might, for example, decide to eliminate any non-reciprocal links.
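One possible decision rule, eliminating any non-reciprocal link, can be sketched as follows in Python. The sketch assumes a binary matrix in which the entry in row i and column j equals 1 when respondent i cites respondent j; the function names and the sample matrix are hypothetical.

```python
import numpy as np

def drop_non_reciprocal(cited):
    """Keep a link between i and j only if i cites j and j cites i."""
    cited = np.asarray(cited)
    # For a 0/1 matrix, the element-wise product with the transpose
    # equals 1 only where both citations are present.
    return cited * cited.T

def non_reciprocal_pairs(cited):
    """List the pairs (i, j), with i < j, whose answers disagree."""
    cited = np.asarray(cited)
    mismatch = cited != cited.T
    rows, cols = np.nonzero(np.triu(mismatch, k=1))
    return list(zip(rows.tolist(), cols.tolist()))

# Hypothetical answers: A cites B and C, B cites A, C cites nobody.
cited = [
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]
print(drop_non_reciprocal(cited))
print(non_reciprocal_pairs(cited))
```

Listing the disagreeing pairs before deleting them lets the researcher see how much of the network the chosen rule removes.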

The question also arises of the symmetry of the relationships between individuals. One might expect, for example, that links relating to esteem should be symmetrical. If A esteems B, one might assume that B esteems A. Other relationships, like links relating to giving advice or lending money, are not necessarily reciprocal. It is, however, risky to assume, without verification, that a relationship is symmetrical. Carley and Krackhardt (1996) show that friendship is not a symmetrical relationship.

2.2. Sampling: determining the boundaries and openness of a network

Choosing which individuals to include and setting network boundaries is a delicate area in network analysis. It is very rare for a network to present clearly demarcated natural boundaries. Networks make light of the formal boundaries we try to impose on organizations (structures, flow charts, job definitions, localization, etc.). Consequently, there is bound to be a certain degree of subjectivity involved when researchers delimit the network they are analyzing. Demarcation of the area being studied is all the more important because of the very strong influence this has on the results of the quantitative analyses being conducted (Doreian and Woodard, 1994). Moreover, research into the ‘small world’ phenomenon shows that it is possible, on average, to connect two people chosen at random anywhere in the world by a chain of six links – or six degrees. It is clear, therefore, that if a network is not controlled, it very quickly leads us outside any organizational logic.

For their data to be of any real use, researchers must restrict themselves to a relatively limited field of investigation and obtain the agreement of all (or almost all) of the individuals who enter within this field. While researchers can begin their investigations without having their field of study written in stone, they do have to determine its boundaries sufficiently early on.

According to Laumann et al. (1983), it is possible to specify the boundaries of a network using either a realist or a nominalist approach. In a realist approach, the researcher adopts the actors’ point of view, whereas in a nominalist approach the researcher adopts certain formal criteria in advance (for example, taking only the first three people cited into account). In both cases, the boundary can be defined in terms of the actors themselves, the relationships between them, or their participation in an activity (a meeting, for example). As a general rule, the research question should serve as a guide in defining the boundaries.

Researchers sometimes need to go beyond this way of thinking about boundaries to take into account the openness of networks. Networks are often analyzed as if they are closed – the actors are considered to be a closed group. However, in many situations this presupposition poses problems.

Doreian and Woodard (1994) propose a systematic process that enables researchers to control sample snowballing. Their method enables researchers to build a sample during the course of their research and to avoid closing off the boundaries of the networks too early, while keeping a check on the individuals included in the network. The first stage consists of obtaining a list of actors included in the network, using strict realist criteria. Those on the list are then asked about the other actors who are essential to their network. Of these new actors, the researcher includes only those who meet a nominalist criterion (for example, taking into account the first three actors cited) – the strictness of the criterion used is determined by the researcher. This criterion is used until no new actors can be included in the network. It is the control the researcher has on the strictness of the nominalist criterion used that enables researchers to halt any snowballing of the sample, and which sets the limits of the network while taking its openness into account from the start.
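The controlled snowballing described above can be sketched as a simple loop in Python. This is one reading of the procedure, not Doreian and Woodard’s own implementation: the dictionary `nominations` stands in for the answers collected from each actor, and the nominalist criterion is taken to be ‘the first `max_cited` actors cited’, as in the example given in the text. All names are hypothetical.

```python
def controlled_snowball(seed_actors, nominations, max_cited=3):
    """Build a network sample from an initial list of actors.

    seed_actors: actors selected with the initial (realist) criteria.
    nominations: dict mapping each actor to the ordered list of actors
                 he or she cites as essential to the network.
    max_cited:   strictness of the nominalist criterion.
    """
    included = list(seed_actors)
    seen = set(included)
    to_ask = list(included)
    while to_ask:
        actor = to_ask.pop(0)
        # Only the first `max_cited` actors cited are taken into account.
        for cited in nominations.get(actor, [])[:max_cited]:
            if cited not in seen:
                seen.add(cited)
                included.append(cited)
                to_ask.append(cited)  # newly included actors are asked in turn
    return included

# Hypothetical answers collected from the actors.
nominations = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "F"],
    "C": ["G"],
    "G": ["A"],
}
sample = controlled_snowball(["A", "B"], nominations, max_cited=3)
print(sample)
```

Lowering `max_cited` makes the criterion stricter and stops the snowballing sooner; raising it opens the network further.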

Analysis Methods in Social Network Research

Once researchers have collected sociometric data (data that measure links between individuals), and so know which actors have links with which other actors, they need to reconstitute the network, or networks, of actors before they can conduct a global network analysis (or analyses). The methods of analysis fall into two main categories. In the first, the researcher’s aim is to identify homogeneous groups within the network. In the second, the focus is on individual particularities and the researcher concentrates on each actor’s position within the network.

1. Formalizing Data, and Initial Analyses

1.1. From adjacency matrix to sociogram

In most cases, computer software is used to process data. The first step is to put the data into the form of a matrix. A matrix is produced for each type of rela­tionship (supervision, work, friendship, influences, financial flow, flow of mate­rials, etc.). The matrix can then be used to obtain a graphic representation.

To construct an adjacency matrix, all of the actors involved are listed along both the columns and the rows of the matrix. If individual A has a relationship with individual B, a number 1 is placed in the corresponding cell at the intersection of row a and column b. If the relationship is directed, its direction must be taken into account. For example, the researcher might study simply the fact that individual A worked with B during the previous three months, or they may study whether A supervised the activity of individual B during this period. In the first case, the working relationship is not directed. A and B worked together and the adjacency matrix is, therefore, symmetrical. The number 1 is placed at the intersection of row a and column b and at the intersection of row b and column a. In the case of the network of supervision, however, the relationship is directed. While A supervises B, B does not necessarily supervise A. The number 1 is placed only at the intersection of row a and column b, and one obtains a non-symmetrical adjacency matrix.
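As a minimal sketch of this construction, the following Python fragment builds both kinds of matrix from a list of links. The actors, the links and the function names are hypothetical and serve only to illustrate the rule described above.

```python
def adjacency_matrix(actors, links, directed):
    """Build a 0/1 adjacency matrix; rows are senders, columns are receivers."""
    index = {name: i for i, name in enumerate(actors)}
    matrix = [[0] * len(actors) for _ in actors]
    for source, target in links:
        matrix[index[source]][index[target]] = 1
        if not directed:
            # an undirected link is recorded in both cells
            matrix[index[target]][index[source]] = 1
    return matrix


def is_symmetric(matrix):
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


actors = ["A", "B", "C"]
# 'worked with' is not directed; 'supervises' is directed
worked_with = adjacency_matrix(actors, [("A", "B"), ("B", "C")], directed=False)
supervises = adjacency_matrix(actors, [("A", "B"), ("A", "C")], directed=True)
print(is_symmetric(worked_with), is_symmetric(supervises))
```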

Once the adjacency matrix has been constructed, it can be represented in the form of a graph. The graph obtained in this way is a sociogram. Figure 14.1 gives an example of an adjacency matrix for a directed network, and its corresponding sociogram.

Sociograms enable us to make a certain number of summary interpreta­tions, and can be sufficient for analyzing simple networks. In the example of Figure 14.1, we can immediately identify C as an important actor. If the relation­ship being studied is one of advice, C is probably an expert. If it is a supervision relationship, C is probably a departmental head. However, once the size of the networks involved increases, visual interpretation of the graph becomes particu­larly difficult. It also becomes hazardous in the sense that the choices made in arranging the elements of the graph have a strong influence on the way it can be interpreted. Researchers need standardized tools that enable systematic analysis.

1.2. General analyses

In analyzing the general structure of a network, indicators can help researchers to assess the overall reality of the structure of the relationships between individuals.

As an example of how indicators can be used, we present below the two most frequently used indicators: density and multiplexity. We will then show how a number of networks can be compared.

Density of a network In any given network, density corresponds to the ratio of the number of existing links to the number of possible links. By existing links we mean those that the researcher has been able to reveal. Possible links refer to all the links that could have existed, taking into account the number of individuals involved. Thus, for n individuals, there are n(n – 1)/2 possible links when the relationship is not directed, and n(n – 1) when it is directed.

The density measurement does not say very much in itself. The researcher can simply make a subjective judgement on the value obtained. However, den­sity becomes particularly interesting if we wish to compare different groups of individuals or different organizations.

To give an example, the communication network in Production Unit A of an organization may be more dense than that in Production Unit B. If we hypothe­size that the density of the communication network is a factor of performance, we can conclude, from a descriptive viewpoint, that Unit A performs better than Unit B. We could also test this hypothesis by seeing if there is a link between the density of a department’s communications network and the evaluation that members of upper management make of that department’s performance.
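To make the comparison concrete, density can be computed directly from the adjacency matrix. The sketch below uses two invented four-person units. The matrices and the function name are hypothetical, and the directed case assumes n(n – 1) possible links.

```python
def density(matrix, directed=False):
    """Ratio of existing links to possible links in a 0/1 adjacency matrix."""
    n = len(matrix)
    if n < 2:
        return 0.0
    if directed:
        existing = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
        possible = n * (n - 1)
    else:
        # count each undirected link once, using the upper triangle only
        existing = sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))
        possible = n * (n - 1) // 2
    return existing / possible


# Hypothetical communication networks in two production units
unit_a = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 1], [1, 0, 1, 0]]
unit_b = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
print(density(unit_a), density(unit_b))
```

Unit A has five of the six possible links and Unit B only two, so Unit A's communication network is the denser of the two.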

Network multiplexity Multiplexity relates to the existence of different kinds of links between individuals. Let us take the example of the relationships within a group of company directors. Between the directors of two companies there can exist relationships of trust, friendship or even guidance. The more dimen­sions the relationship involves, the more it is said to be multiplex. If n is the number of different links existing between the units being studied (individuals, companies, etc.) and p is the number of units cited as being linked, the degree of multiplexity is the relationship n/p. This general indicator requires delicate handling. It does not take into account the distribution of multiplexity within the network – two networks can have the same degree of multiplexity but very different structures. For example, similar multiplexity is obtained in the two following cases. In the first case, a minority of individuals have very multiplex links (that is, involving direction, influence, assistance and advice) while the others have simple links. In the second case, most of the individuals have just two kinds of link (for example, assistance and advice).
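The limitation of this indicator can be seen in a small numerical sketch. The reading adopted here is an assumption: n is taken as the total number of links of all kinds, and p as the number of pairs of units that are linked at all. The two networks and the function name are invented.

```python
def multiplexity(pair_relations):
    """Average number of kinds of link per linked pair (assumed reading of n/p)."""
    linked = {pair: kinds for pair, kinds in pair_relations.items() if kinds}
    if not linked:
        return 0.0
    n_links = sum(len(kinds) for kinds in linked.values())
    return n_links / len(linked)


# Case 1: one very multiplex pair, the others with simple links
concentrated = {
    ("A", "B"): {"direction", "influence", "assistance", "advice"},
    ("A", "C"): {"advice"},
    ("B", "C"): {"assistance"},
}
# Case 2: every pair has the same two kinds of link
even = {
    ("A", "B"): {"assistance", "advice"},
    ("A", "C"): {"assistance", "advice"},
    ("B", "C"): {"assistance", "advice"},
}
print(multiplexity(concentrated), multiplexity(even))
```

Both networks obtain the same value even though multiplexity is distributed very differently within them, which is why the indicator should be read alongside the underlying distribution.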

Other indices It is possible to compare networks using other means than gen­eral indices that measure either density or multiplexity. Researchers can assess how much overlap there is between the networks. Software packages can cal­culate the number of common links between adjacency matrices. Random net­works can be generated and compared to the empirical data. This comparison enables researchers to assess the specific nature of the data they collect: does the network obtained have a direction (an underlying order) or does it present the same characteristics as a network constructed ‘by chance’? Software packages can also carry out statistical internetwork analyses. Ucinet IV, for example, can analyze the correlation between two different networks. This tool can be used to test whether it is relevant to use relationships about which there is available information to approximate relationships that are difficult to evaluate. For example, is it relevant, in a given context, to evaluate the trust between a net­work’s units (information that is difficult to collect on a wide scale) from the flow of information between them? If, when using a test sample, we notice that the network of trust is strongly correlated with the telephone communications network (taken as a proxy for information flow), then the answer is yes.
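A rough idea of such comparisons can be given with the sketch below, which counts the links common to two adjacency matrices and computes a plain Pearson correlation between their off-diagonal cells. This is not presented as the procedure used by Ucinet or any other package. The trust and telephone matrices are invented, and because the cells of a network matrix are not independent observations, the usual significance tests should not be applied to the coefficient obtained.

```python
from math import sqrt


def off_diagonal(matrix):
    """Flatten a square matrix, leaving out the diagonal (self-links)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def common_links(m1, m2):
    """Number of links present in both networks."""
    return sum(1 for a, b in zip(off_diagonal(m1), off_diagonal(m2)) if a and b)


def matrix_correlation(m1, m2):
    """Pearson correlation between the cells of two networks (same actors)."""
    x, y = off_diagonal(m1), off_diagonal(m2)
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical test sample: trust links and telephone contacts among four units
trust = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
phone = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
print(common_links(trust, phone), round(matrix_correlation(trust, phone), 2))
```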

2. Grouping Methods

Analyzing the general structure of a network provides useful initial informa­tion, but researchers quite often need to go beyond this type of analysis. There are a series of methods available that enable researchers to group together individuals within a network. Using these methods, they can identify collective actors, according to an essential grouping principle: cohesion. A second princi­ple, that of equivalence, is used to group together individuals who occupy similar positions within the network.

2.1. Strongly cohesive groups

The principle of cohesion involves grouping together individuals within a net­work on the basis of them being ‘close’ to each other – of distinguishing subgroups by their strong density. Research into subgroups within a network corresponds in general terms to a desire to reveal the existence of ‘collective’ actors (for example, a dominant coalition) within an organization, and to study how the relationships between these actors are structured.

A ‘collective’ actor is often represented in the world of networks by a clique – a set of individuals who are all interconnected. In the example below we present a sociogram of cliques. Within a clique, links are direct (all the individuals are linked to each other, without an intermediary). Generally, for a clique to exist at all, all the individuals must be connected by strong links. This means that if the relationship is directed, the links must go in both directions (from A towards B and from B to A). However, it is possible to define subgroups using indirect links. This is the case when one reveals an n-clique. In an n-clique, every pair of individuals is connected by a path of at most n links. This means that the path between two individuals in the n-clique passes through at most (n – 1) intermediaries. N-cliques enable researchers to find organizational subgroups using criteria that are less strict than those governing cliques. Another approach consists of disregarding the direction of a directed relationship, in which case weak components are obtained. However, neither n-cliques nor weak components are really satisfactory. They are rarely meaningful in organizational reality. The criteria used actually make it difficult to interpret the groups that are constituted. How, for example, can a researcher constitute a group while disregarding the direction of a supervision relationship? Or how can a group be defined around the concept of links that are longer than 1 (the length being defined by the number of links, or degrees, separating two units)?
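The two definitions can be checked mechanically, as in the following sketch. The function names and the four-person chain are hypothetical. The code tests only the conditions stated above for a given set of actors and does not search for the largest groups that satisfy them.

```python
from collections import deque
from itertools import combinations


def is_clique(matrix, members):
    """True if every pair of members is linked directly, in both directions."""
    return all(matrix[a][b] and matrix[b][a] for a, b in combinations(members, 2))


def geodesic_lengths(matrix, start):
    """Breadth-first search: shortest path length from start to each reachable actor."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(matrix[u]):
            if linked and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_n_clique(matrix, members, n):
    """True if every pair of members is joined by a path of at most n links."""
    for a in members:
        dist = geodesic_lengths(matrix, a)
        if any(dist.get(b, n + 1) > n for b in members):
            return False
    return True


# Hypothetical undirected chain 0 - 1 - 2 - 3
chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
print(is_clique(chain, [0, 1, 2]), is_n_clique(chain, [0, 1, 2], 2))
```

Actors 0, 1 and 2 do not form a clique, since 0 and 2 are not directly linked, but they do satisfy the 2-clique condition because 0 reaches 2 through a single intermediary.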

However, there is a real need to find ‘collective actors’ within networks using more flexible criteria than those of the clique (direct and strong links) and without resorting to simplistic solutions like those of n-cliques or weak com­ponents (Frank, 1995).

The most commonly used software packages offer one solution to this, by either establishing criteria about the minimum number of interactions each actor must have with the other actors in the group, or by fixing a maximal value for the number of missing interactions. Another, similar, criterion involves estab­lishing how many individuals must be removed from the group for its members to become disconnected (Borgatti et al., 1992). In this way the researcher isolates those blocs in the matrix that necessitate the removal of n individuals to become disconnected.
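The first two criteria can be illustrated with a short sketch. It computes, for a candidate group, the smallest number of links any member has inside the group and the largest number of links any member is missing. The matrix and function names are hypothetical, and the third criterion (the number of individuals whose removal disconnects the group) is not covered here.

```python
def min_internal_links(matrix, members):
    """Smallest number of links any member has with the other members."""
    return min(
        sum(1 for b in members if b != a and matrix[a][b]) for a in members
    )


def max_missing_links(matrix, members):
    """Largest number of fellow members that any one member is not linked to."""
    return (len(members) - 1) - min_internal_links(matrix, members)


# Hypothetical undirected group: everyone is linked except actors 0 and 3
group_matrix = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
]
members = [0, 1, 2, 3]
print(min_internal_links(group_matrix, members), max_missing_links(group_matrix, members))
```

With a threshold of at least two internal links per actor, or at most one missing link, these four actors would be accepted as a group even though they do not form a clique.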

Other commonly used criteria include co-membership of cliques. The fact that two individuals belong to a significant number of the same cliques is considered to be an indication of their social closeness. The criterion of co-membership of cliques, or any other closeness criterion, can be applied systematically by means of ascendant algorithms (grouping begins from the individual) or descendant algorithms (grouping starts from the whole of the network, which is then subdivided). These algorithms can vary in their degree of sophistication. Software packages offer solutions using these criteria and the appropriate algorithms in different ways. Some of the procedures enable this type of grouping to be effected for valued graphs.

When using this type of grouping method, a common problem is the frequent need to fix limits for the criteria used. These limits have to be fixed according to the context, as there are no objective criteria to refer to. Moreover, using limits does not always enable groups to be formed without overlap (Frank, 1995). It may be preferable to choose methods that involve fixing a limit that corresponds to the data used rather than those that require an arbitrary limit.

With all these methods, the principle is to maximize the intra-group interactions and minimize those that are intergroup. A vast range of possibili­ties is available to the researcher – it is not possible for us to describe here all the procedures that exist in the literature. However, it seems that, in practice, there are relatively few procedures that can be used for any particular type of data. In fact, restrictions are frequently made on the type of data that can be used for each of the procedures proposed by software packages. For example, the procedure presented below is suitable for a valued network, but not for a directed graph.

2.2. Equivalence classes

Researchers can also try to group together individuals that have the same kind of links with other members of the network. This is called equivalence. However, the members of an equivalence class are not necessarily linked to each other. The example below presents a study using the notion of structural equivalence.

Grouping by equivalence classes can be used to take into account the con­cept of social role and status. If we take the example of the different positions held within a company, we can suppose that each worker has similar relation­ships with individuals in other classes (executives, senior managers, floor man­agers, etc.). Grouping by equivalence classes allows us to identify classes of individuals who play the same role, independently of the one formally defined by their status and job description.

The main point of this process is to create classes that are defined not in themselves, but according to the relationship that links their members to other individuals. We can distinguish between structural, regular and automorphic equivalence.

Structural equivalence occurs when all the elements of one class have rela­tionships with all the members of another class. In the army, for example, all the subordinates must show respect for their superiors.

Regular equivalence corresponds to the situation in which, if a member of class 1 is linked to a member of class 2, all the members of class 1 must have a link with at least one member of class 2, and all the members of class 2 must have a link with at least one member of class 1. In a factory, for example, each supervisor is in charge of at least one worker and each worker is under the charge of at least one supervisor.

Two individuals belong to the same automorphic equivalence class if it is possible to switch their positions in the network around and reconstitute a net­work that is isomorphic to the original – that is, with exactly the same form as the initial network. This situation arises when the networks of two actors are exactly symmetrical. One can imagine, for example, two project managers in a company finding themselves in a situation of automorphic equivalence.

It seems clear that the type of equivalence sought depends directly on the problem being studied and the research question. Figure 14.3 illustrates struc­tural, regular and automorphic equivalence.

In reality, it is rare to find classes that correspond precisely to any one of these three types of equivalence, and the strict application of one of the three definitions only rarely results in classes that can be interpreted in terms of social roles. It is generally more relevant to use one of the many statistical approxima­tion procedures that software programs offer, and we will now give an overview of these.

A number of these procedures are designed to respect the logic of structural equivalence. This postulates that members of a single class within a network all have exactly the same links with the other actors in the network. This means that the adjacency matrix rows for these actors will be identical. The statistical approximation methods most often used in this case consist of grouping actors that resemble each other the most into one class. To do this, the closeness of the individuals is evaluated – by calculating, for example, a Euclidean distance, a correlation coefficient, or the number of common links shared by rows of the adjacency matrix. Once this has been achieved, the individuals are put into groups with the aid of a classification method (see Chapter 13 for a discussion of classification methods). This type of process can be applied to valued graphs or used when researching multiplex relationships.
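The first step of such a procedure, evaluating the closeness of two actors, might look like the sketch below. It computes a Euclidean distance between two rows of the adjacency matrix. Leaving out the cells that relate the two actors to themselves and to each other is an assumption made here, and the supervision matrix is invented.

```python
from math import sqrt


def row_distance(matrix, a, b):
    """Euclidean distance between the links that actors a and b send to third parties."""
    others = [k for k in range(len(matrix)) if k not in (a, b)]
    return sqrt(sum((matrix[a][k] - matrix[b][k]) ** 2 for k in others))


# Hypothetical directed supervision network:
# actor 4 supervises 0 and 1, who both supervise 2 and 3
supervision = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
]
print(row_distance(supervision, 0, 1), row_distance(supervision, 0, 4))
```

A distance of zero, as between actors 0 and 1, corresponds to identical rows. The matrix of pairwise distances can then be passed to a classification method. For a directed relationship, columns (links received) could be compared in the same way.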

Other algorithmic procedures aim to approach regular equivalence. They proceed by making successive comparisons of all the pairs of individuals. The researcher compares each pair of individuals with all the other pairs in the net­work and then evaluates whether they can be placed in the same equivalence class. These comparisons serve as a basis for calculating an index showing the closeness between pairs. A classification method can then be used.

There are also several ways of grouping together individuals according to the principles of automorphic equivalence. For example, it is possible to use geodesic equivalence to approximate automorphic equivalence. The geodesic path is the shortest path between two individuals. It is represented by its length (the number of links separating two units). Geodesic approximation involves calculating the length of the geodesic paths that link each individual to each of the other individuals in the network. Two individuals are considered to be equivalent if they present the same types of geodesic path; for example, if they are both linked to two individuals by a path of length 1, to three others by a path of length 2, etc.
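A minimal sketch of geodesic approximation follows. For each actor it counts how many others lie at each geodesic distance, then groups actors whose counts match. The function names and the five-person chain are hypothetical, and the network is assumed to be binary.

```python
from collections import Counter, deque


def geodesic_profile(matrix, start):
    """Sorted (distance, number of actors at that distance) pairs for one actor."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(matrix[u]):
            if linked and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = Counter(d for actor, d in dist.items() if actor != start)
    return tuple(sorted(counts.items()))


def geodesic_classes(matrix):
    """Group actors that present the same geodesic profile."""
    classes = {}
    for actor in range(len(matrix)):
        classes.setdefault(geodesic_profile(matrix, actor), []).append(actor)
    return list(classes.values())


# Hypothetical undirected chain 0 - 1 - 2 - 3 - 4
path5 = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
print(geodesic_classes(path5))
```

In this chain the two end actors fall into one class, their two neighbours into a second, and the middle actor into a class of its own.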

Researchers are advised to be particularly careful when deciding upon the methods they will use to approach equivalence. While many methods can be used with valued graphs, these may not always be applicable to non-valued graphs.

3. Methods for Demonstrating the Notion of Centrality

Another aim of network analysis is to concentrate on particular actors and establish the role that their structural position permits them to play within the organization.

Literature on networks often focuses on individuals in central positions. We can generally suppose that the actors who play a key role within the network are able to draw a certain advantage from it. There are many ways of defining cen­trality and the corresponding algorithms vary in their degree of sophistication.

Freeman (1979) draws a distinction between centrality of degree, of close­ness and of ‘betweenness’.

Centrality of degree Centrality of degree corresponds to the number of connections an individual has. Individuals are considered to be central if they are strongly connected to other members of the network. In other words, the centrality of degree index for each individual A is equal to the number of direct relationships he or she has. This index is purely local. It depends neither on the characteristics of the network as a whole, nor on the characteristics of the other individuals to which individual A is linked. However, an individual who has numerous links with marginal individuals is much less central than an individual whose links are with individuals who are themselves central. To resolve this difficulty, an index of relative or normed centrality is calculated for each individual, by dividing the absolute centrality scores by the maximal centrality possible for the graph. This concept is explained below.
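As a brief illustration, both the absolute and the normed index can be read off the adjacency matrix. The sketch assumes an undirected binary network without self-links, in which the maximal possible degree is n – 1. The star network and function names are hypothetical.

```python
def degree_centrality(matrix):
    """Absolute centrality of degree: number of direct links per actor."""
    return [sum(row) for row in matrix]


def normed_degree_centrality(matrix):
    """Degree divided by the maximal possible degree, n - 1."""
    n = len(matrix)
    return [degree / (n - 1) for degree in degree_centrality(matrix)]


# Hypothetical star: actor 0 is linked to the four others, who have no other links
star = [[0, 1, 1, 1, 1]] + [[1, 0, 0, 0, 0] for _ in range(4)]
print(degree_centrality(star))
print(normed_degree_centrality(star))
```

The normed index lies between 0 and 1, which makes scores obtained in networks of different sizes easier to compare.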

Centrality of closeness Centrality of closeness is an assessment of an individual’s centrality through an evaluation of their closeness to all the other individuals in a network. It is a more global measurement, which takes into account not only the connections individuals have with their immediate neighbors, but also their closeness to all the other members of the network.

One measurement used is the sum of the geodesic distances linking one point to all the other points of the graph. Geodesic distance is the length of the shortest path linking two individuals on a graph. The centrality of an individual can be measured by comparing his or her total geodesic distance from other individuals with that of other actors in the network – the smaller the figure, the greater the individual’s centrality. As we demonstrate below, other measurements based on the same principle have also been proposed.
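The sum of geodesic distances can be obtained with a breadth-first search from each actor, as in the sketch below. It assumes a connected, binary network, and the function names and the five-person chain are hypothetical.

```python
from collections import deque


def geodesic_lengths(matrix, start):
    """Shortest path length from start to every reachable actor."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(matrix[u]):
            if linked and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def total_geodesic_distance(matrix):
    """For each actor, the sum of geodesic distances to all other actors."""
    totals = []
    for actor in range(len(matrix)):
        dist = geodesic_lengths(matrix, actor)
        if len(dist) < len(matrix):
            raise ValueError("network is not connected")
        totals.append(sum(dist.values()))
    return totals


# Hypothetical undirected chain 0 - 1 - 2 - 3 - 4
path5 = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
print(total_geodesic_distance(path5))
```

The middle actor has the smallest total and is therefore the most central in terms of closeness.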

Centrality of ‘betweenness’ Evaluation of centrality can also be based on the individual’s role as a ‘go-between’. Freeman (1979) proposed that an individual may be only slightly connected to others (weak centrality of degree), but prove to be an essential intermediary in exchanges. An individual’s ‘betweenness’ vis-a-vis two other people is defined by the frequency with which he or she appears on the geodesic path or paths (that is, of minimal length) linking the two others. The values of the centrality index of ‘betweenness’ vary between 0 and 1, and can be compared between different networks.
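The following sketch computes such an index for an undirected, binary network. Each pair of actors contributes to an intermediary the share of their geodesic paths that pass through him or her, and the total is divided by (n – 1)(n – 2)/2, the number of pairs an actor could stand between. The network, in which actor 3 is the only bridge between two groups of three, and the function names are hypothetical.

```python
from collections import deque


def shortest_path_counts(matrix, start):
    """Geodesic length and number of geodesic paths from start to each actor."""
    dist, paths = {start: 0}, {start: 1}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(matrix[u]):
            if not linked:
                continue
            if v not in dist:
                dist[v], paths[v] = dist[u] + 1, 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                paths[v] += paths[u]      # every geodesic to u extends to v
    return dist, paths


def betweenness(matrix):
    """Normed betweenness for an undirected network of at least three actors."""
    n = len(matrix)
    info = [shortest_path_counts(matrix, s) for s in range(n)]
    scores = [0.0] * n
    for s in range(n):
        dist_s, paths_s = info[s]
        for t in range(s + 1, n):
            if t not in dist_s:
                continue                  # s and t are not connected
            for v in range(n):
                dist_v, paths_v = info[v]
                on_geodesic = (v not in (s, t) and v in dist_s and t in dist_v
                               and dist_s[v] + dist_v[t] == dist_s[t])
                if on_geodesic:
                    scores[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return [score / ((n - 1) * (n - 2) / 2) for score in scores]


def undirected_matrix(n, links):
    matrix = [[0] * n for _ in range(n)]
    for a, b in links:
        matrix[a][b] = matrix[b][a] = 1
    return matrix


# Two groups of three (0-1-2 and 4-5-6) joined only through actor 3
links = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
network = undirected_matrix(7, links)
scores = betweenness(network)
print([round(score, 2) for score in scores])
```

Actor 3 has only two direct links, yet obtains the highest score because every exchange between the two groups has to pass through him or her.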

Other measurements of centrality As Hage and Harary (1995) have shown, models of centrality using the concepts of centrality of degree, closeness and ‘betweenness’ have been used in many research works, for example to study power in informal exchange networks (Hage and Harary, 1983) and social stratification in commercial networks (Hunt, 1988; Irwin, 1983; Milicic, 1993). Nonetheless, the authors also proposed supplementary measures.

In the case of a network where each individual is linked to at least one other individual, Hage and Harary (1995) propose introducing the idea of eccentri­city. By eccentricity, the two authors mean the maximal distance that separates a given individual from other individuals. Using the notion of eccentricity, one can calculate the diameter of a network, which is equal to the maximal eccen­tricity existing within it. Similarly, the notion of radius is understood as the minimal eccentricity. An individual is, therefore, central if his or her eccentricity is equal to the radius of the network. Another way of measuring centrality was proposed by Bonacich (1987).
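These notions follow directly from the geodesic distances, as the sketch below shows for a connected, undirected network. The function names and the five-person chain are hypothetical.

```python
from collections import deque


def eccentricities(matrix):
    """For each actor, the maximal geodesic distance to any other actor."""
    result = []
    for start in range(len(matrix)):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, linked in enumerate(matrix[u]):
                if linked and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(matrix):
            raise ValueError("network is not connected")
        result.append(max(dist.values()))
    return result


# Hypothetical undirected chain 0 - 1 - 2 - 3 - 4
path5 = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
ecc = eccentricities(path5)
diameter = max(ecc)      # maximal eccentricity
radius = min(ecc)        # minimal eccentricity
central = [actor for actor, e in enumerate(ecc) if e == radius]
print(ecc, diameter, radius, central)
```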

For comparative purposes, the researcher needs to be in a position to judge the network’s overall centralization. There is a marked difference between being a unit at the center of a decentralized network and one at the center of a centralized network. When judging the network as a whole, therefore, we talk about its centralization. This indicator always varies between 0 and 1, and indi­cates the degree to which the maximal centrality is greater than the centrality of all the other points. When it equals 0, all the centralities are equal. When it equals 1, one point dominates the centralization of the network. This situation corresponds to a star.
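One way of computing such an indicator, based on centrality of degree, is sketched below. The formula used (the sum of the differences between the maximal degree and each degree, divided by (n – 1)(n – 2)) is Freeman's degree-based version for undirected networks. It is given as one possible choice, since the text does not specify a formula, and the two example networks are invented.

```python
def degree_centralization(matrix):
    """Degree-based centralization of an undirected binary network (0 to 1)."""
    n = len(matrix)
    if n < 3:
        raise ValueError("needs at least three actors")
    degrees = [sum(row) for row in matrix]
    top = max(degrees)
    # (n - 1)(n - 2) is the value of the numerator in a star of n actors
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical star: actor 0 is linked to everyone, the others only to actor 0
star = [[0, 1, 1, 1, 1]] + [[1, 0, 0, 0, 0] for _ in range(4)]
# Hypothetical ring: each actor is linked to exactly two neighbours
ring = [[1 if abs(i - j) in (1, 4) else 0 for j in range(5)] for i in range(5)]
print(degree_centralization(star), degree_centralization(ring))
```

The star reaches the maximal value of 1, while the ring, in which all centralities are equal, has a centralization of 0.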


Fundamentals of Longitudinal Analyses

1. Definition and Role of Time

Longitudinal analyses form a group of analyses focusing on the study of pheno­mena over the course of time. Longitudinal analyses are often contrasted with cross-sectional studies, by arguing that the data collected for longitudinal analyses relate to at least two distinct points in time, while for cross-sectional studies data are collected in relation to a distinct moment. This distinction, while ostensibly correct, does however raise some problems, and therefore needs to be refined.

First, it should be noted that this dichotomy is only a simplification of a more complex phenomenon. Thus, a pure cross-sectional analysis would require the data to be collected at one particular moment, or at least over a period suffi­ciently short to be considered as such. However, in management, data collection often extends over a relatively long period, frequently lasting for several months. We then must question whether this is still a cross-sectional collection. Similarly, this makes it necessary to explain how the time that has elapsed might have an impact on the study. Take, for example, a research study that seeks to investigate the perception two different managers in the same company have of a past investment. If a week passes between interviewing one manager and interviewing the second, we would wonder whether this time lapse had had an impact on the study or not – that is, whether the responses of the second manager would have been the same a week earlier. It is quite possible that during the week something might have happened to change her perception.

Drawing from Menard’s (1991) definition, we can recognize the following three characteristics:

  • the data relate to at least two distinct periods;
  • the subjects are identical or at least comparable from one period to the next;
  • the analysis consists basically in comparing data between (or over the course of) two distinct time periods, or in retracing the observed evolution.

Depending on the research, time may be attributed an important role or it may be relegated to a secondary level. At the extremes of this continuum we would find studies strongly influenced by the time factor, and studies of the development of a phenomenon without particular reference to time. It is there­fore essential that researchers consider the significance they wish to attribute to time in their project, to ensure that the research design will enable the research question to be answered.

When time is an important factor in the research, it might be considered in terms of either duration or chronology. Duration corresponds to an interval between two points in time, and is measured in terms of the subject at hand. According to the phenomenon being studied, duration may be expressed in seconds, hours, days, years, etc. For example, it could concern the duration of the development of an innovation, or the time lapse between a takeover bid and a restructuring. Chronology, however, is external to the subject of the study, existing outside the research. Chronology is used to determine the order of occurrence of events, and in management research it is generally expressed by dates.

Finally, another possible use of time is in terms of cohorts. The concept of cohorts is drawn from demographics, where a cohort refers to a group of indi­viduals born at the same period (birth cohort). Generalizing from this we can define a cohort as a group of observations having experienced the same event on a particular date. Determining a cohort in a research project allows researchers to make multiple comparisons. We can measure the differences between cohorts, or the evolution of a cohort. Table 15.1 presents a summary of these different possibilities.
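The two kinds of comparison that cohorts allow can be illustrated with a small sketch. The observations, the variable names and the values are entirely invented: each record gives a firm, the year in which it experienced the common event that defines its cohort, the year of observation and a measured value.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (firm, year of the common event, year observed, value)
observations = [
    ("F1", 1995, 1996, 10.0), ("F1", 1995, 1997, 12.0),
    ("F2", 1995, 1996, 14.0), ("F2", 1995, 1997, 16.0),
    ("F3", 1996, 1997, 8.0),
    ("F4", 1996, 1997, 10.0),
]


def cohort_means(records):
    """Mean value for each (cohort, observation year) combination."""
    grouped = defaultdict(list)
    for _firm, event_year, observed_year, value in records:
        grouped[(event_year, observed_year)].append(value)
    return {key: mean(values) for key, values in sorted(grouped.items())}


means = cohort_means(observations)
# Evolution of one cohort over time
evolution_1995 = means[(1995, 1997)] - means[(1995, 1996)]
# Difference between two cohorts observed in the same year
gap_1997 = means[(1995, 1997)] - means[(1996, 1997)]
print(means, evolution_1995, gap_1997)
```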

2. Preliminary Questions

A researcher who decides to undertake longitudinal research is confronted with two major questions: what period of time should be covered by the study, and at how many points in time should data be collected over this period? While the answer to the first question depends on the research question, the second question is linked to the importance the research places on time.

2.1. Analysis period

Determining the analysis period requires fixing the limits of the time interval in which data are to be gathered. In setting these limits several elements must be considered.

The first element is the research question. This provides us with the infor­mation needed to determine the phenomenon that is to be studied, and thus to determine the study period. It is always up to researchers themselves to set the time limits of their studies, while ensuring that the data they collect will enable them to answer their research question.

The researcher can also question the continuity of the phenomenon being studied: is it regarded as a permanent element in the life of an organization or is it only temporary? To fully understand this distinction let us look at the topic of change. In some studies, change is considered as forming an integral part of the life of an organization (a permanent phenomenon). Conversely, when change is seen as a particular event (a temporary phenomenon) it is studied over a limited period in the evolution of an organization. This period could be extended, depending on the research question, from the realization of the necessity for change to the stabilization of an organization after that change has been implemented.

2.2. Data collection points

Data collection points refer to moments in the life of the phenomenon for which data are to be collected. They do not necessarily coincide with the moments the researcher collects data. For example, the data may be collected on a single occa­sion, some time after the event, when the phenomenon has passed completely.

Given that a study qualifies as longitudinal on the basis of having two or more data collection points, the problem arises as to whether it must be limited to those two points, or whether this number should be increased. In the latter case, the time interval separating the points needs to be determined.

Limiting data collection to two points in time comes down to carrying out a study using a pre-test, post-test design. The choice of collection points can therefore have a strong impact on the results (Rogosa, 1988). Furthermore, this type of research does not enable the process of the phenomenon’s evolution to be studied, that is, what happens between what we observe ‘before’ and what we observe ‘after’. If one wants to focus on analyzing a process, it is therefore appropriate to increase the number of data collection points.

When researchers increase the number of data collection points, however, they are still confronted with the problem of how much time should separate these various points. To determine this time interval they once again need to consider the place of time in longitudinal research. There are three possible scenarios here:

  • When time is not important, collection intervals depend on the evolution of the phenomenon being studied. In this case these intervals are irregular and vary according to the successive states of the phenomenon or the occurrence of events affecting this phenomenon.
  • When time (in terms of duration) is a key element in the research, the period that elapses between two events is measured by predefined units: hours, days, years, etc. In this case data must be collected regularly, respecting this period.
  • When time (in terms of chronology) is important, it has a starting point that is common to all the observations (usually a date). In theory, continuous data collection is necessary so as to be able to note the dates each event occurs (other dates correspond to non-occurrences). These occurrences can then be accurately positioned in the context of time. In practice this type of collection can be difficult to implement, in which case the researcher can gather information within the context of regular periods, and reconstitute the chronology of the events afterwards.

There is a fourth case, however, although this is transversal to the three pre­ceding case scenarios. Time can be used to class individuals into cohorts – groups of individuals (or, more generally, of observations) that experienced a common event on a particular date. Once a cohort has been identified, the posi­tion of time can vary according to the research question, as before.

Table 15.2 summarizes longitudinal research designs according to the above case scenarios.

3. Problems Related to Data Collection

Researchers deciding to carry out longitudinal research can choose between:

  • collecting data retrospectively, and therefore studying a past phenomenon
  • collecting data in real time, on a phenomenon that may occur or on a pheno­menon as it occurs.

3.1. Problems related to retrospective data collection

Retrospective studies (concerning a past phenomenon) draw upon archived secondary data and/or primary data retracing the evolution of a phenomenon after the events (mainly retrospective interviews).

The secondary data needed for retrospective research raise two types of problems: accessibility and validity. In its most acute form, the problem of accessibility can make it absolutely impossible for the researcher to obtain the information required for the research. This information might not exist, not have been preserved, be impossible to find, or be refused (explicitly or implicitly). The question of the validity of documents, when they can be obtained, also arises. The original purpose of the document is one important initial consideration, as biases, intentional or not, might have been introduced into it by its author. Documents should also be considered in the context in which they were written. The organization of a company may have been different, or the way in which certain indices were calculated could have changed – factors such as these make comparisons precarious.

Primary data is generally in the form of retrospective interviews, which can be influenced by two important biases: faulty memory and rationalization after the fact. By faulty memory we mean that the person questioned may not remember certain events, either intentionally (he or she does not want to remember) or unintentionally (the phenomenon relates to an unremarkable event which he or she has forgotten). Rationalization too may be either intentional (a desire to present things in a positive light) or not (an unconscious ‘tidying up’). To add to the problem, these two biases are not mutually exclusive. Memory lapses and rationalization have created doubt about using retrospective interviews (Golden, 1992), although their supposed limitations have been hotly debated. Miller et al. (1997) argue that the validity of retrospective interviews relies above all on the instrument used to gather the data.

There are several ways researchers can limit the effects of these biases. To limit errors resulting from memory lapses, the following methods are recommended:

  • If the research question permits, interviews can focus on events that are relatively memorable for the people being questioned, or interviewees can be selected according to their degree of involvement in the phenomenon being studied (Glick et al., 1990).
  • Information given in different interviews can be compared, or secondary data can be used to verify it (Yin, 1990).
  • A non-directive interview method can be used, in which interviewees are not pushed to answer if they don’t remember (Miller et al., 1997).
  • Each interview can be transcribed or recorded, so that interviewees can add to their initial statements.

Several strategies can be employed to limit rationalization after the fact:

  • Interviewees can be asked to list events chronologically before they are asked to establish causal connections between them.
  • Information given in different interviews can be compared.
  • Dates of events can be verified through secondary sources.

3.2. Problems related to collecting longitudinal data in real time

The collection of data in real time consists in studying a phenomenon at the same time as it is happening.

As the collection of data in real time extends over a given period, the problem arises of how to interpret any changes or developments that may be observed. Should they be attributed to the phenomenon itself or to the measuring instrument used (in collecting and analyzing the data)? When a measuring instrument is used on successive occasions there is a risk of it falsifying the observations. To avoid such a bias, administration conditions must not vary from one period to another. It is incumbent on the researcher to control external variables that might influence responses to the questions. These variables include, for example, the person who administers the survey, external events that may occur between two collection points, and the context in which the survey is administered.

A second source of bias emerges when the first wave of data collection leads to the introduction of new hypotheses, or to the modification of existing ones. In extreme cases the original hypotheses may be brought into question between two successive rounds of data collection. In this situation the researcher has no choice but to take these new hypotheses into account, while ensuring that the data collected before the initial hypotheses were modified are appropriate to the new questions that have been raised. If this is not the case, that information will have to be excluded from the study.

3.3. General problems related to collecting longitudinal data

Longitudinal research also presents the more general problem of the evolution of the variables used to explain the successive modifications of a phenomenon (Menard, 1991). Explanatory variables are prone to variation between the time they are collected and the time the phenomenon occurs. If the impact of such an evolution might falsify the results of the study, the researcher will have to change the data collection strategies used or use an analysis taking this into account. For example, data collection could instead be spaced out over the period of the study.

Longitudinal studies that involve several organizations face the problem of having to take the life cycles of these organizations into consideration (Kimberly, 1976). Comparative studies, for instance, would be best carried out using data from comparable periods in the organizations’ life cycles.

Quantitative Longitudinal Analyses

A large number of quantitative longitudinal methods are not specific to longitudinal analysis, and therefore will not be developed in detail here. An example of this is regression. Nevertheless, additional conditions often have to be respected when applying these methods to longitudinal analyses – notably concerning error terms, which should be homoscedastic and free of autocorrelation. If these conditions are not respected, either the model will need to be adjusted, or procedures that do not rely on the variance-covariance assumptions must be used (for more details, see Bergh and Holbein, 1997).

1. Event Analysis Methods

Longitudinal analyses are often chosen because a researcher is particularly concerned with certain events that may affect the life of an organization. When the event itself is the object of study, the event history analysis method is used. If, however, the event is not the subject of the study, but the research seeks to discern the influence of a certain event on the actual subject, the event study method would be used.

1.1. Event history analysis

Any relatively sudden qualitative change that happens to an individual or to an organization can be classed as an event. Examples of events that may occur include an individual being promoted or dismissed, or an organization experiencing a strike or a takeover bid.

One could consider using regression to study such events. However, regression entails a risk of introducing a bias into the results, or of losing information. In fact, the data that is collected possesses two characteristics that violate the assumptions underlying ‘classic’ regression (Allison, 1984). The first of these characteristics appears when data have been collected on the possible causes of an event: explanatory variables can change over the observation period. Imagine, for example, a study trying to identify how much time will pass between when a company begins exporting and when it establishes a presence abroad. Those companies that began exporting in 1980 could be used as a sample, observing in each case whether they have established a presence abroad, and if so, when this occurred, along with possible explanations. For example, sales figures achieved overseas could possibly result in consolidating a company’s position there. The problem we encounter here is that the explanatory variables could change radically during the study. A company that established itself overseas in 1982 could see a sharp increase in export sales figures because of this fact. Sales figures, from being a cause of the foreign development, then become a consequence of it.

The second characteristic of the collected data that prevents the use of classical regression is known as censoring: this refers to the situation when data collection has been interrupted by the end of the study. In our example, information is gathered on the establishment of companies overseas from 1980 to the present. At the end of the studied period, certain companies will not have established themselves overseas, which poses a major problem. We do not know the value of the dependent variable for these companies (the period between beginning to export and establishing a presence). This value is then said to be ‘censored’. Within classical regression, the only possible solutions to such a situation could lead to serious biases: eliminating these companies from the sample would strongly bias it and make the results questionable; replacing the missing data with any particular value (including a maximum value – the time from 1980 to the end of the study) would underestimate the real value and so once again falsify the results.

Another important concept when analyzing event history concerns the risk period (Yamaguchi, 1991). The time in which the event is not happening can generally be broken into two parts: the period in which there is a risk the event will happen, and the period in which occurrence of the event is not possible. For example, if the studied event is the dismissal of employees, the risk period corresponds to the time the studied employees have a job. When they do not have a job, they are at no risk of losing it. We can also speak in terms of a risk set to refer to those individuals who may experience the event.

Two groups of methods can be used to study transition rates (or hazard rates): parametric and semi-parametric methods on the one hand, and non-parametric methods on the other. The first rely on hypotheses about the distribution of time (generally exponential-type distributions, such as Weibull or Gompertz), and aim to estimate the effects of explanatory variables on the hazard rate. Non-parametric methods, however, make no hypotheses about the distribution of time, nor do they consider relationships between the hazard rate and the explanatory variables. Instead, they are used to assess the hazard rate specific to a group, formed according to a nominal explanatory variable that does not change over time.
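To make the notions of censoring and risk set more concrete, here is a minimal sketch of one well-known non-parametric technique, the Kaplan-Meier (product-limit) estimator. The text does not name this estimator, so its use here is an assumption made for illustration. The function name and the sample durations are hypothetical.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time an event occurred.

    durations: time until the event, or until observation stopped
    observed:  True if the event occurred, False if the case is censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)   # events and censored cases leaving the risk set
    at_risk = len(durations)     # size of the risk set before the first exit
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Share of the current risk set that does not experience the event at t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]      # censored cases leave the risk set without an event
    return curve

# Hypothetical data: years between starting to export and establishing a presence abroad.
# False marks companies that had not yet established a presence when the study ended.
durations = [2, 3, 3, 5, 8, 8]
observed = [True, True, False, True, False, False]
for time, prob in kaplan_meier(durations, observed):
    print(time, round(prob, 3))
```

Censored companies are neither dropped nor given an arbitrary duration: they count in the risk set for as long as they were observed, which avoids the two biases described above.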

1.2. Event study

The name ‘event study’ can be misleading: it does not refer to studying a particular event, but rather its impact on the object of the research. Event study methods measure how a dependent variable will evolve in the light of the occurrence of a particular event. The evolution of this variable is therefore measured both before and after the occurrence of the event. Through an OLS regression, the researcher tries to estimate what the value of the dependent variable would have been on the date the event occurred, had that event not taken place. The difference between the calculated value and the real value is called abnormal return, and it is considered to correspond to the effect of the event. This abnormal return is generally standardized by dividing it by its standard deviation. However, it can happen that the effect of the event takes place on a different date (before the event, if it was anticipated, or after it, if the event had a delayed impact). The impact might also be diluted over a somewhat longer period, called an event window – in which case we can calculate the sum of the differences between the estimated values and those observed in the event window. If the aggregate difference is not zero, a statistical test is used to verify whether it is significant.
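One common form of this significance test, as described by McWilliams and Siegel (1997), can be sketched as follows. The notation is an assumption introduced here for illustration: $AR_{it}$ is the abnormal return of observation $i$ at date $t$, $SD_{it}$ its standard deviation, $k$ the number of dates in the event window, and $n$ the number of observations in the sample.

```latex
SAR_{it} = \frac{AR_{it}}{SD_{it}}, \qquad
CAR_i = \frac{1}{\sqrt{k}} \sum_{t=1}^{k} SAR_{it}, \qquad
ACAR = \frac{1}{n} \sum_{i=1}^{n} CAR_i, \qquad
Z = ACAR \times \sqrt{n}
```

Under the assumption that the standardized abnormal returns are independent and identically distributed, $Z$ is compared with the standard normal distribution.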

McWilliams and Siegel (1997) emphasize the precautions that must be taken with such types of analysis. First, the sample should be reasonably large, as the test statistics used are based on normality assumptions. Moreover, OLS regressions are very sensitive to outliers, which should therefore be identified, especially if the sample is small. Another major difficulty concerns the length of the event window. It should be as short as possible, to exclude disruptive elements external to the study, while still remaining long enough to capture the impact of the event. Finally, the abnormal returns should be justified theoretically.

2. Sequence Methods

Sequence methods are used to study processes. One type of research that can be conducted using these methods consists in recognizing and comparing sequences. We could, for example, establish a list of the different positions held by CEOs (chief executive officers) of large companies over their careers, as well as the amount of time spent at each post. The sequences formed in this way could then be compared, and typical career paths could be determined.

Another type of research could aim at determining the order of occurrence of the different stages of a process. For example, the classic decision models indicate that in making decisions we move through phases of analyzing the problem, researching information, evaluating the consequences and making a choice. But in reality, a single decision can involve returning to these different steps many times over, and it would be difficult to determine the order in which these stages occur ‘on average’.

2.1. Comparing sequences

Sequence comparison methods have to be chosen according to the type of data available. We can distinguish sequences according to the possibility of the recurrence or non-recurrence of the events they are composed of, and according to the necessity of knowing or not knowing the distance between these events. The distance between events can be assessed by averaging the temporal distance between them across all cases, or on the basis of a direct categorical resemblance, or by considering transition rates (Abbott, 1990).

The simplest case is a sequence in which every event is observed once and only once (non-recurrent sequence). In this case, two sequences can be compared using a simple correlation coefficient. Each sequence is arranged in order of occurrence of the events it is composed of, and the events are numbered according to their order of appearance. The sequences are then compared two by two, using a rank correlation coefficient. The higher the coefficient (approaching 1) the more similar the sequences are. A typical sequence can then be established: the sequence from which the others differ least. A typology of possible sequences could also be established, using a typological analysis (such as hierarchical or non-hierarchical classification, or multidimensional analysis of similarities). This procedure does not require measuring the distances between events.

The most frequently used measure of correlation is the Spearman rank correlation coefficient. It is calculated in the following manner:

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

where

$d_i$ = distance between the two classifications of event $i$

$n$ = number of classified events.

This coefficient assumes that the ranks are equidistant – in other words, that the distance between ranks 3 and 4 is equal to the distance between ranks 15 and 16, for example. For cases in which it seems that this hypothesis is not appropriate, we can use Kendall’s tau, which is calculated as follows:

$$\tau = \frac{n_a - n_d}{N}$$

where

$n_a$ = number of agreements between two classifications (any pair of objects classed in the same rank order both times)

$n_d$ = number of discrepancies between two classifications (any pair of objects classed in different rank order)

$N$ = number of possible pairs.

Finally, if the classifications reveal any tied events, an index such as Goodman and Kruskal’s gamma should be used, with the following formula:

$$\gamma = \frac{n_a - n_d}{n_a + n_d}$$
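The three coefficients above can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for a statistics library. The function names and the two example rankings are hypothetical, and the Spearman function assumes untied ranks.

```python
from itertools import combinations

def spearman_rho(rank_x, rank_y):
    """Spearman coefficient for two rankings of the same events (no ties)."""
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

def agreement_counts(rank_x, rank_y):
    """Count pairs ranked in the same order (na) and in opposite order (nd)."""
    na = nd = 0
    for i, j in combinations(range(len(rank_x)), 2):
        sign = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
        if sign > 0:
            na += 1
        elif sign < 0:
            nd += 1
        # sign == 0 means a tie in at least one ranking: counted in neither total
    return na, nd

def kendall_tau(rank_x, rank_y):
    na, nd = agreement_counts(rank_x, rank_y)
    n = len(rank_x)
    return (na - nd) / (n * (n - 1) / 2)   # N = number of possible pairs

def goodman_kruskal_gamma(rank_x, rank_y):
    na, nd = agreement_counts(rank_x, rank_y)
    return (na - nd) / (na + nd)           # tied pairs are left out of the denominator

# Hypothetical example: order of appearance of five events in two cases
case_1 = [1, 2, 3, 4, 5]
case_2 = [2, 1, 3, 5, 4]
print(spearman_rho(case_1, case_2))           # approximately 0.8
print(kendall_tau(case_1, case_2))            # 0.6
print(goodman_kruskal_gamma(case_1, case_2))  # 0.6
```

Without ties, Kendall’s tau and the gamma coincide, since every possible pair is either an agreement or a discrepancy.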

In the case of recurring sequences, the most popular approach is to use a Markov chain, or process, which postulates that the probability of an event occurring depends entirely on its immediate predecessor. A Markov chain is defined with the conditional probabilities that make up the matrix of transitions. This matrix groups estimations based on observed proportions (the percentage of times an event is followed by another – or by itself, on the matrix diagonal).
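As an illustration, the matrix of transitions can be estimated from observed sequences by counting how often each event is immediately followed by each other event. The sketch below assumes sequences are given as lists of event labels. The function name and the example data are hypothetical.

```python
from collections import defaultdict

def transition_matrix(sequences):
    """Estimate first-order transition probabilities from observed proportions."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        # Pair every event with its immediate successor
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    states = sorted({event for seq in sequences for event in seq})
    matrix = {}
    for a in states:
        total = sum(counts[a].values())
        # An event that is never followed by anything gets a row of zeros
        matrix[a] = {b: (counts[a][b] / total if total else 0.0) for b in states}
    return matrix

# Hypothetical recurrent sequences observed in two cases
sequences = [["A", "A", "B", "C"], ["A", "B", "B", "C"]]
for state, row in transition_matrix(sequences).items():
    print(state, row)
```

The diagonal entries (for example A followed by A) give the estimated probability that an event is followed by itself.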

Another possibility for recurring sequences consists of a group of techniques called optimal matching. The optimal matching algorithm between two sequences begins with the first sequence and calculates the number of additions or suppressions necessary to produce the second sequence (Abbott and Forrest, 1986). The necessary transformations are weighted according to the distance between the events. We then obtain a matrix of distances between sequences, which can be used in sequence comparisons.
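A minimal sketch of this idea uses a standard dynamic-programming edit distance. The cost settings are assumptions chosen for illustration: each addition or suppression costs `indel_cost`, and replacing one event with a different one costs the same as one suppression plus one addition unless another weighting is supplied. The function names are hypothetical.

```python
def optimal_matching_distance(seq_a, seq_b, indel_cost=1.0, substitution_cost=None):
    """Minimum total cost of transforming seq_a into seq_b."""
    if substitution_cost is None:
        def substitution_cost(x, y):
            # Default assumption: a substitution equals one suppression plus one addition
            return 0.0 if x == y else 2.0 * indel_cost

    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost          # suppress every event of seq_a
    for j in range(1, cols):
        dist[0][j] = j * indel_cost          # add every event of seq_b
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,                                    # suppression
                dist[i][j - 1] + indel_cost,                                    # addition
                dist[i - 1][j - 1] + substitution_cost(seq_a[i - 1], seq_b[j - 1]),
            )
    return dist[-1][-1]

def distance_matrix(sequences, **costs):
    """Pairwise distances between all sequences, for use in a later typology."""
    return [[optimal_matching_distance(a, b, **costs) for b in sequences]
            for a in sequences]

print(optimal_matching_distance("ABCD", "ABD"))  # one suppression
print(distance_matrix(["ABCD", "ABD", "ABCD"]))
```

A custom `substitution_cost` function is where a weighting based on the distance between events would be supplied.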

2.2. Determining order of occurrence

Other types of research can be used to determine the order of occurrence of events. In this case the researcher wants to identify a general pattern from a mass of events. One of these methods was proposed by Pelz (1985), who applied it to analyzing innovation processes. The method is based on Goodman and Kruskal’s gamma, and allows the order of observed events to be established, as well as defining to what extent these events overlap.

The calculation method comprises the following steps:

  • The number P of times that event A happens before event B is counted.
  • The number Q of times that event B happens before event A is counted.
  • The gamma is calculated for each pair of events as follows:

$$\gamma = \frac{P - Q}{P + Q}$$

γ is between +1 and -1.

  • Repeating the process for each pair of events enables a square gamma matrix to be established, with a number of rows equal to the total number of events.

From this gamma matrix the researcher can calculate a time sequence score, which determines in which order the events took place, and a separation score, which indicates whether the events are separated from one another or if they overlap.

The time sequence score is obtained by calculating the mean from the columns of gamma values. In this way, a score ranging from +1 to -1 is ascribed to each event. By reclassifying the events according to their score, in diminishing order, the events are ordered chronologically.

The separation score is obtained by calculating the mean from the columns of absolute gamma values. Each event is credited with a score of between 0 and 1. It is generally considered that an event for which the score is equal to or above 0.5 is clearly separated from those that surround it, whereas an event with a score lower than 0.5 cannot be separated from the events that surround it and therefore must be grouped with these events (Poole and Roth, 1989).
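The calculation steps and the two scores can be sketched as follows. The orientation of the matrix is an assumption made here: each column is taken to hold, for one event A, the gamma values for ‘A happens before B’ against every other event B, so that the earliest events receive the highest time sequence scores. Cases are assumed to be chronologically ordered lists of event labels containing at least two distinct events. The function name and the data are hypothetical.

```python
def pelz_scores(cases):
    """Return (chronological order, time sequence scores, separation scores)."""
    events = sorted({e for case in cases for e in case})
    # before[a][b] = number of times event a is observed before event b
    before = {a: {b: 0 for b in events} for a in events}
    for case in cases:
        for i, a in enumerate(case):
            for b in case[i + 1:]:
                if a != b:
                    before[a][b] += 1

    def gamma(a, b):
        p, q = before[a][b], before[b][a]
        return (p - q) / (p + q) if p + q else 0.0   # 0 when the pair is never observed

    time_score, separation = {}, {}
    for a in events:
        column = [gamma(a, b) for b in events if b != a]
        time_score[a] = sum(column) / len(column)
        separation[a] = sum(abs(g) for g in column) / len(column)
    order = sorted(events, key=lambda e: time_score[e], reverse=True)
    return order, time_score, separation

# Hypothetical innovation processes observed in three cases
cases = [
    ["idea", "design", "test", "launch"],
    ["idea", "test", "design", "launch"],
    ["idea", "design", "test", "launch"],
]
order, time_score, separation = pelz_scores(cases)
print(order)
print(time_score)
print(separation)
```

Only the order of events within each case is used, which is consistent with the method’s independence from the time elapsed between events.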

The interpretation of Pelz’s gamma, in which a high gamma indicates that two events are separate rather than associated, is the opposite of the interpretation of Goodman and Kruskal’s gamma, on which the method is based. This is because it is calculated using two variables: time, and the passage from event A to event B. A high gamma therefore indicates that the passage from A to B is strongly associated with the passage of time. The interpretation that can be drawn is that A and B are strongly separated.

An important advantage of this method is that it is independent of the time elapsed between events: therefore it is not necessary for this information to be available. In fact, the results do not change if the interval of time between the two incidents differs. The only point that has to be observed in relation to time is chronological order.

3. Cohort Analysis

Cohorts represent groups of observations having in common the fact that they have experienced the same event within a given period of time. The event in question is frequently birth but could be any notable event. The period of this event may extend over a variable duration, often between one and ten years. But for very dramatic events it can be considerably reduced. Cohort analysis enables us to study changes in behavior or attitudes in these groups. We can observe three types of changes: changes in actual behavior, changes due to aging, or changes due to an event occurring during a particular period (Glenn, 1977). We can distinguish intra-cohort analysis, focusing on the evolution of a cohort, from intercohort analysis, in which the emphasis is on comparisons.

3.1. Intra-cohort analysis

Intra-cohort analysis consists in following a cohort through time to observe changes in the phenomenon being studied. Let us imagine that we want to study the relationship between the age of a firm and its profitability. We could select a cohort, say that of companies created between 1946 and 1950, and follow them over the course of time by recording a profitability measure once a year. This very simple study of a trend within a cohort does, however, raise several problems. First, a certain number of companies will inevitably drop out of the sample over time. It is, in fact, likely that this mortality in the sample will strongly bias the study, because the weakest companies (and therefore the least profitable) are the most likely to disappear. Another problem is that intra-cohort analyses generally use aggregated data, in which effects can counterbalance each other. For example, if half the companies record increased profits, while the other half record a decrease in their profits of equal proportions, the total effect is nullified. Methods originally developed for studying panel data can, however, be used to resolve this problem.

A third problem raised by intra-cohort studies is that our study will not enable us to ascertain the true impact of company age on profitability. Even if we observe a rise in profitability, we will not know if this is due to the age of the company or to an external event – an effect of history such as a particularly favorable economic situation. Other analyses are therefore necessary.

3.2. Inter-cohort analysis

Among the other analyses just mentioned, one method is to compare several cohorts at a given time. In our case, this could bring us to compare, for example, the profitability in 1990 of companies created at different periods. In this we are leaving the domain of longitudinal analysis, though, as such a study is typically cross-sectional. Second, this design on its own would not enable us to resolve our research question. In fact, any differences we might observe could be attributed to age, but also to a cohort effect: companies established in a certain era may have benefited from favorable economic circumstances and may continue to benefit from those favorable circumstances today.

3.3. Simultaneous analysis of different cohorts

The connection between age and profitability of companies can be established only by simultaneous intra- and inter-cohort analysis. Changes observed in the performance levels of companies could be due to three different types of effects: the effects of age (or of aging, which is pertinent in this case), cohort effects (the fact of belonging to a particular cohort), and period effects (the time at which profitability is measured). To try to differentiate these effects, a table can be established with rows representing age groups and with the observation periods recorded in columns. Where it is possible to separate data into regular intervals this should always be preferred. The period between readings should also be used to delimit the cohorts. For example, if the data easily divides into ten-year intervals, ten-year cohorts are preferable – although this is not always possible, it does make for better analyses. The resulting table can be used to complete intra-cohort and inter-cohort analyses. If identical time intervals have been used for the rows and the columns (for example, ten-year intervals), the table presents the advantage that the diagonals will give the intra-cohort tendencies (Glenn, 1977).
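The diagonal reading of such a table can be sketched as follows, assuming a layout with age groups in rows and observation periods in columns, and identical ten-year intervals for both. The function name and the profitability figures are hypothetical.

```python
def cohort_trajectories(table):
    """Group the cells of an age-by-period table into intra-cohort series.

    table maps (age, period) to a measured value. A cohort is identified
    by its founding date, period - age, so each diagonal of the table
    forms one cohort's trajectory.
    """
    cohorts = {}
    for (age, period), value in table.items():
        cohorts.setdefault(period - age, []).append((period, value))
    return {cohort: sorted(series) for cohort, series in cohorts.items()}

# Hypothetical mean profitability (%) by firm age (rows) and year of measurement (columns)
table = {
    (0, 1970): 2.0, (0, 1980): 1.5, (0, 1990): 1.0,
    (10, 1970): 3.0, (10, 1980): 3.5, (10, 1990): 2.5,
    (20, 1970): 4.1, (20, 1980): 3.2, (20, 1990): 4.0,
}
for cohort, series in sorted(cohort_trajectories(table).items()):
    print(cohort, series)
```

Reading down a column compares cohorts at a single period (inter-cohort analysis), while each series returned by the function follows one cohort over time (intra-cohort analysis).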

All the same, the differences observed between the cells of the table have to be analyzed with caution. First, these differences should always be tested to see if they are statistically significant. Second, the findings may have been biased by the mortality of the sample, as we mentioned earlier. If the elements that have disappeared did not have the same distribution as those that remain, the structure of the sample will change. Finally, it is very difficult to differentiate the three possible effects (age, cohort, and period), because they are linearly dependent, which poses problems in analyses such as regression, where the explanatory variables must be independent. Here, in the case of a birth cohort, the three factors are linked by the relationship:

cohort = period – age

A final possibility consists in recording each age, each cohort, and each period as a dummy variable in a regression. However, this leads us to form the hypothesis that the effects do not interact: for example, that the effect of age is the same for all the cohorts and all the periods, which is usually unrealistic (Glenn, 1977).
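The linear dependence between age, period, and cohort can be checked numerically. In the sketch below, a hypothetical design matrix with a constant, age, period, and cohort (computed as period minus age) has four columns but only rank three, so a regression cannot separate the three effects. The `matrix_rank` helper is written here for illustration, using exact fractions to avoid rounding problems.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination with exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry in this column
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical observations: every combination of three ages and three periods
observations = [(age, period) for age in (10, 20, 30) for period in (1970, 1980, 1990)]
design = [[1, age, period, period - age] for age, period in observations]

print(len(design[0]), "columns, rank", matrix_rank(design))
```

Dropping any one of the three variables restores full column rank, which is why additional assumptions are needed before all three effects can be estimated together.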

Qualitative Longitudinal Analysis

The mass of information collected when doing qualitative research can be impressive. But this mass of data cannot simply be analyzed directly – it must first be manipulated and put into usable form.

1. Preliminary Processing of Qualitative Longitudinal Data

Very often the first question for researchers using qualitative longitudinal data is how to reduce the copious amount of information they have amassed. The risk of drowning in data is real, and a number of authors have proposed simple techniques for consolidating and summarizing qualitative data, or reducing the amount of data, before analysis.

1.1. Monograph

The first preliminary step in processing qualitative longitudinal data is to write a monograph. A monograph traces the development of the phenomenon being studied, over the analysis period defined by the researcher. It gives a transversal view of the phenomenon while reducing the amount of information that has been accumulated during the data collection phase. Often taking the form of a descriptive narrative, a monograph can be accompanied by an initial analysis of the data (Eisenhardt, 1989), perhaps in the form of a graph of relationships between the events, in which chronological order will be respected.

When research focuses on several organizations (or on several case studies), monographs also provide the basic elements for comparison. They enable individual development patterns to emerge which, when compared with each other, facilitate the identification of common characteristics.

1.2. Time-ordered matrices

The time-ordered matrix proposed by Miles and Huberman (1994) presents information relating to a phenomenon by establishing a temporal relationship between the variables it is composed of. The aim is to understand quickly and easily what has happened. Such a matrix has its columns arranged in sequence, so that one can see when a given phenomenon occurred. The basic principle is chronology (Miles and Huberman, 1994).

In constructing time-ordered matrices, researchers begin by determining the specific components or aspects of the phenomenon they are studying. These form the rows of the matrix. The columns represent successive periods – that is, the period of analysis divided into subperiods, or successive stages of development. The intersection between a row and a column shows the changes that have occurred to a component or to an aspect of the phenomenon over the course of a given period.

This chronological matrix enables the researcher to pinpoint shifts or important modifications that have been experienced by components of the phenomenon.
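For researchers who keep their coded field notes in electronic form, the construction of such a matrix can be sketched as below. The data structure (a list of component, period, and note triples), the function name, and the example notes are all assumptions made for illustration.

```python
def time_ordered_matrix(observations, periods):
    """Arrange coded notes with components in rows and ordered periods in columns.

    observations: iterable of (component, period, note) triples
    periods:      list of period labels in chronological order
    """
    components = sorted({component for component, _, _ in observations})
    matrix = {c: {p: [] for p in periods} for c in components}
    for component, period, note in observations:
        matrix[component][period].append(note)   # raises KeyError for an unknown period
    return matrix

# Hypothetical coded notes on an organizational change
periods = ["1995-1996", "1997-1998"]
observations = [
    ("structure", "1995-1996", "creation of a project team"),
    ("structure", "1997-1998", "project team merged into the marketing department"),
    ("leadership", "1997-1998", "new CEO appointed"),
]
matrix = time_ordered_matrix(observations, periods)
for component, row in matrix.items():
    print(component, [row[p] for p in periods])
```

Empty cells remain visible, which helps show the periods in which a component did not change.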

Miles and Huberman (1994) also propose longitudinal variants of the basic time-ordered matrix: the role-by-time matrix and the time-ordered meta-matrix. In the first, the rows represent individuals. This matrix allows us to determine at which moment an action was carried out by a protagonist occupying a particular role in relation to the phenomenon being studied. The time-ordered meta-matrix compares the evolution of a particular phenomenon for several cases at the same time. The different case studies make up the rows and the time intervals of the period of analysis form the columns. This matrix can be constructed in relation to the studied phenomenon as a whole, or one of its components alone.

The tools presented above pave the way for the analysis. In fact, it is difficult to determine which is an element of the preliminary processing of the data, and which forms part of the analysis itself.

2. Qualitative Analysis of Longitudinal Data

Qualitative longitudinal analysis methods are rarely formalized. However, certain authors have proposed general procedures that can be used to analyze the evolution of a phenomenon.

2.1. Analyzing a phenomenon in terms of time

Van de Ven and Poole (1989) proposed a method that can be broken into four steps:

  1. Put together a chronological list of events that occurred during the course of the studied phenomenon. An ‘event’ is understood to mean a change experienced by one of the conceptual categories studied.
  2. Rearrange this list according to the conceptual categories of the research, in order to establish, for each category, a chronological series of events – which is called a trajectory. The set of trajectories gives us a description of the process studied.
  3. Carry out a phase analysis. This consists in identifying discrete phases of activity and analyzing their sequences and properties. A phase is defined as being a meaningful set of simultaneous activities within the trajectories established in the second stage. Thus, a phase is a set of changes undergone by a certain number of conceptual categories.
  4. Examine the order of sequences in the series of connected events.
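The first two steps of this method lend themselves to a simple sketch. The representation of an event as a (date, category, description) triple, the function name, and the example events are assumptions made for illustration. Steps 3 and 4 remain interpretive work for the researcher.

```python
from collections import defaultdict

def build_trajectories(events):
    """Return the chronological event list (step 1) and one trajectory per category (step 2).

    events: iterable of (date, category, description) triples, with dates
            written so that they sort chronologically (for example 'YYYY-MM').
    """
    chronological = sorted(events, key=lambda event: event[0])
    trajectories = defaultdict(list)
    for date, category, description in chronological:
        trajectories[category].append((date, description))
    return chronological, dict(trajectories)

# Hypothetical events coded during the study of an innovation process
events = [
    ("1996-03", "people", "project leader replaced"),
    ("1995-01", "ideas", "initial concept proposed"),
    ("1995-09", "transactions", "partnership agreement signed"),
    ("1996-06", "ideas", "concept redefined"),
]
chronological, trajectories = build_trajectories(events)
print(chronological[0])
print(trajectories["ideas"])
```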

2.2. Concepts describing evolution: stages, cycles, phases, and sequences

Stages We speak of a stage in the evolution of a phenomenon to characterize a moment in this evolution. The stage can sometimes signify a provisional stopping point. All evolution is essentially a succession of stages.

Cycles This can have two different meanings. A cycle can be seen as a recurrent succession of steps giving cadence to the evolution of a system by always returning it to its original state, as in the cycle of the seasons. This is known as cyclical evolution. We can also describe as a cycle the evolution of a phenomenon that follows a fixed order without necessarily being recurrent – as in the life cycle, where every creature is born, grows and dies. This is called an evolution schema.

Both of these types of cycles have been identified in organizational theory. Cyclical evolution is found, for example, when successive periods of stability and change are observed. Evolution schema can be seen in the recognition of evolutionary constants in the life of organizations. The cycle then expresses a permanent organizational phenomenon. It can be broken up into phases that represent different stages of the evolution of the organization.

Phases The concept of the phase is very close to that of the cycle as understood in its second meaning: that is to say, as a succession of stages which always occur in the same order. Phases are temporary phenomena in the life of the organization (for example, the phases in the development of a new product). They generally follow on from each other in a given, irreversible, order, but they can overlap. Phases are composed of fairly unified activities that carry out a function necessary for the evolution of the phenomenon (Poole, 1983). By working from an overview of the phenomenon, the researcher tries to determine a relatively limited number of phases that take place in a definite order.

Sequences A sequence is defined as an ordered succession of events or objects. This order, as defined by Abbott (1990), may be temporal or spatial (although we are here only concerned with temporal order). A sequence may be either continuous or discrete.

When the objects observed are phases in the evolution of a phenomenon, the development model obtained using sequence methods is identical to that obtained by phase analysis. However, the order of the events (or phases) is not irreversible and shows more complex evolutions, such as retroactive looping, permutations, recurrent and non-recurrent events, etc.

2.3. Concepts describing dynamics: dynamic factors and rupture points

The passage from one event to another or from one phase to another within an evolving phenomenon is not always stable and linear. Evolving phenomena are subject to interference, cycles, rupture points, etc. These factors of dynamics can create accelerations, slowdowns, reversals, or ruptures within the evolution of a single phenomenon.

In their research on decision processes, Mintzberg et al. (1976) identify six dynamic factors:

  • interrupts, which are caused by environmental forces and suspend the evolution of the phenomenon
  • scheduling delays, which permit managers who are under strong time pressures to break complex processes down into manageable steps
  • feedback delays, which characterize periods in which managers are waiting to see the results of actions already under way before undertaking other actions
  • timing delays and speedups, which result from the intervention of managers wanting perhaps to seize an opportunity or to create a surprise effect, or to wait for more favorable conditions or gain time
  • comprehension cycles, which enable a better understanding of a complex problem by going over it numerous times
  • failure recycles, which lead the decision-maker to slow down the process while waiting for an acceptable solution when none has proved satisfactory so far, or to change the criteria relating to a problem to make one of the proposed solutions acceptable.

Rupture points, which represent transitions between the main trends in the development of a phenomenon, are also factors in the phenomenon’s dynamics. Poole (1983) distinguishes three types of rupture points:

  • Normal points, which result from a process that can be described as ordinary. These include, for example, adjournments of a decision process operating within a small group.
  • Delays (or cycles of comprehension), which signify a period in which the observed phenomenon is suspended. These periods are important, as they can signal either the beginning of a difficult phase, or a time of great creativity. The actors involved in the evolution of the phenomenon are generally unable to anticipate these rupture points.
  • Ruptures (or interruptions), which characterize an internal conflict, or the arrival of unexpected results. Ruptures result in a reorientation of the evolution of the observed phenomenon.

Analyzing Representations and Discourse

Research in management and organizational science often relies on the analysis of communications, either oral (conversations, individual or group interviews) or written (annual reports, strategic plans, letters to shareholders, etc.). Researchers may simply want to analyze the content or the structure of these communica­tions; or they may attempt to establish, through the text or discourse, the author’s representations or thought processes. Inspired in particular by the cognitive approach to organizations, many researchers today are developing an interest in individuals’, groups’ or organizations’ representations. In a very broad sense, by ‘representation’ we mean the structure composed of the beliefs, values and opinions concerning a specific object, and the interconnections between them. This structure is supposed to enable individuals to impose coherence on infor­mation received, and therefore to facilitate its comprehension and interpreta­tion. From this point of view, in order to understand the decisions and actions taken by an organization, one first needs to apprehend the representations of the actors with whom they originate.

Thus, discourse and documents are believed to transmit some of the represen­tations of organizational members, or their interests and concerns. Researchers can turn to different methods to enable them to reduce and analyze the mass of data contained in the discourse and documents. These methods were developed as an alternative to subjective interpretation, and to avoid running the risk of filtering or deforming the information.

We do not intend here to present all possible methods for analyzing representations and discourse. Indeed, these methods, which come from such varied domains as linguistics, social and cognitive psychology, statistics and artificial intelligence (Stubbs, 1983; Roberts, 1997) are particularly numerous. We will restrict ourselves to presenting those methods that are used the most in management research: content analysis and cognitive mapping.

Content analysis Content analysis is based on the postulate that the repetition of units of analysis of discourse (words, expressions or similar signifiers, or sentences and paragraphs) reveals the interests and concerns of the authors of the discourse. The text (a written document or the transcription of an interview or speech) is broken down and rearranged in terms of the units of analysis that the researcher has decided to study, according to a precise coding methodology. The analyses are based on the classification of the different units of analysis into a limited number of categories related to the objectives of the study. These analyses usually involve counting, statistical analysis or more qualitative analysis of the context in which the words appear in the discourse.

Content analysis can be used, among other things, to analyze responses to open-ended survey questions, to compare different organizations’ strategies through their discourse and the documents which they distribute, or to discern the interests of different individuals, groups or organizations.

Cognitive mapping This method, which stems from cognitive psychology, has been used frequently in management since the late 1970s (inspired in particular by the work of Axelrod, 1976). The objective of this method is to establish and analyze cognitive maps, that is, representations of a person’s or an organization’s beliefs concerning a particular domain (Axelrod, 1976). A cognitive map is composed of two elements:

  1. Concepts, also called constructs or variables; ideas that describe a problem or a particular domain.
  2. The links between these concepts.

Once they are collected, these concepts and relations can be represented graphically in the form of nodes and arrows: the nodes standing for the concepts (or categories) and the arrows symbolizing the links between these elements.
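The graphical form described above can also be stored as data. The following is a minimal Python sketch, not a method taken from the source: the concept names echo an example sentence used later in this chapter, while the signs (+1 for a positive link, -1 for a negative one) and the helper name build_map are assumptions made for illustration.

```python
# Hypothetical cognitive map stored as signed, directed links.
# Each triple is (source concept, target concept, sign of the link).
links = [
    ("price reduction", "new customers", +1),
    ("price reduction", "competition", -1),
    ("new customers", "corporate growth", +1),
]

def build_map(links):
    """Return {source concept: {target concept: sign}} from link triples."""
    cmap = {}
    for source, target, sign in links:
        cmap.setdefault(source, {})[target] = sign
        cmap.setdefault(target, {})  # every concept appears as a key, even with no outgoing link
    return cmap

cognitive_map = build_map(links)
concepts = sorted(cognitive_map)
n_links = sum(len(targets) for targets in cognitive_map.values())
```

Stored this way, two maps built from the same set of concepts can be compared link by link, which is one reason a shared set of initial concepts makes comparison easier.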

A cognitive map is supposed to be precise enough to capture the person’s perceptual filters and idiosyncratic vision (Langfield-Smith, 1992). Although it is not aimed at representing the subject’s thought processes, the beliefs it reveals are considered to be at their root.

Cognitive mapping is principally used in management:

  • To study the representations of individuals, especially managers, to explore their vision. These studies often attempt to compare different people’s repre­sentations, or those of the same person over the course of time; to explain or predict behavior; or to assist executives in formulating strategic problems.
  • To establish and study a group, organization or a sector’s representation. In this case, the studies’ goals are to understand either the evolution of corporate strategy over a period of several years, or the interactions and influence of different groups of managers.

Methods of Discourse and Representation Analysis

Discourse and representation analysis methods generally require three major steps: data collection (Subsection 1), coding (Subsection 2), and analysis (Subsection 3).

1. Collecting Discourse or Representations

There are two main types of methods for collecting representations or discourse: structured (or a priori) and non-structured methods.

1.1. Structured or a priori methods

The objective of structured methods is to directly generate a subject’s representation concerning the problem or theme that the researcher is interested in. These representations are established graphically in the form of cognitive maps (for example, Figures 16.2 and 16.3 later in this chapter). Structured methods are not based on natural discourse data. They cannot, therefore, be used for content analysis (a method based on coding ‘natural’ textual or interview data).

With structured methods, the researcher establishes the subject’s representation using a predefined framework (whence the term ‘a priori’ for this type of method). Whatever type of representation the researcher wishes to generate, structured methods require two steps:

  1. The researcher chooses a set of categories (to establish category schemes) or concepts – also known as constructs or variables – (to establish cognitive maps). A category is a class of objects that are supposed to share similar attributes. If we are concerned, for example, with an executive’s representa­tion of the competitive environment in the textile sector, the following cate­gories might be selected: ‘textile firms’, ‘firms selling trendy fabric’, ‘classics’, ‘firms with top-quality merchandise’, ‘lower-quality’, ‘successful firms’, ‘less successful’, etc. A concept, a construct or a variable is an idea that is likely to describe a particular problem or domain, and that can acquire dif­ferent values or represent the level, presence or absence of a phenomenon or an object (Axelrod, 1976), for example, ‘corporate cost-effectiveness’, ‘the manager has a strategic mind’, or ‘the employees are highly adaptable’.
  2. Once a certain number of elements have been selected (generally around ten), the researcher submits them to the respondent and asks what kind of link he or she sees between them: hierarchical (is Category A included in Category B?); similarity or difference (used to establish category schemes); or influence or causal links (does A influence B? If so, how: positively or negatively), which are used to establish, for example, cognitive maps.

The advantage of structured methods is that they generate reliable data: researchers will obtain the same type of data if they use the same methods on other subjects or on the same subjects on different occasions (stability), and other researchers who use these methods will also get the same results (replicability) (Laukkanen, 1992: 22). Because these methods do not require data coding, they spare the researcher a tremendous amount of pre-collection work and the reliability problems related to this phase. They are therefore usable on a large scale. But the main advantage of these methods is that they generate representations emanating from the same set of initial concepts or categories. Representations established in this manner can thus be immediately compared to each other and are easily aggregated.

The major drawback in this type of method is that the elements of the rep­resentation do not originate from the subject. So we run the risk of dispossess­ing the subject of part of its representation or even of introducing elements that do not belong to it (Cossette and Audet, 1992).

1.2. Non-structured methods

The purpose of these methods is to generate data that is as natural as possible.

These methods dissociate the data collection phase from the coding and analysis phases.

Interview methods If the researcher wishes to establish the representation of a subject concerning a particular domain, or if there is no existing data about the theme in question, the researcher will collect discourse data from a free or semi-structured interview. These interviews are generally recorded and then transcribed in their entirety so that they can be coded (for more details about this step, see below).

The main advantage of these methods is the validity of the data produced. The data, having usually been generated spontaneously by the respondent or in response to open questions, is more likely to reflect what they really think (Cossette and Audet, 1992). In addition, these methods generate much richer data than do structured methods.

The logical counterpart to these advantages is that these methods reduce the reliability of the data produced. And, insofar as they demand a lot of work on the researcher’s part before the data can be coded, they are not practicable on a large scale. In fact, they are mostly used for in-depth studies of discourse or representations of a small number of subjects (see Cossette and Audet, 1992).

Documentary methods When the researcher has transcriptions of discourse or meetings, or else documents (for example, strategic plans, letters to shareholders, activity reports) at their disposal, they will use documentary methods.

The main advantage of these methods is that they avoid data reliability problems, as the researcher does not intervene in the data-production process. In addition, these methods do not require any transcription work.

These methods are commonly used to establish the representation or to analyze the organization or group’s discourse.

2. Coding

The coding process consists of breaking down the contents of a discourse or text into units of analysis (words, phrases, themes, etc.) and integrating them into categories which are determined by the purpose of the research.

2.1. Defining the unit of analysis

The unit of analysis is the basic unit for breaking down the discourse or text.

Depending on the chosen method of analysis (content analysis or cognitive mapping) and the purpose of the research, the researcher usually opts for one of the units of analysis below (Weber, 1990):

  • a word – for example, proper or common nouns, verbs or pronouns
  • the meaning of a word or group of words – certain computer programs can now identify different meanings for the same word or expression
  • whole sentences
  • parts of sentences of the subject/verb/object type. For example, the sentence, ‘The price reduction attracts new customers and stymies the competition’, will be divided into two units of analysis: first, ‘The price reduction attracts new customers’, and then, ‘The price reduction stymies the competition’. Identifying this type of unit of analysis, which does not correspond to a precise unit of text (for example, word, sentence) can be relatively tricky
  • one or more paragraphs, or even an entire text. Weber (1990) points out the disadvantages in choosing this type of unit of analysis, in terms of coding reliability. It is much more difficult to come to an agreement on the classification of a set of phrases than of a word.

2.2. Classifying units of analysis

Once the units of analysis have been pinpointed in the discourse or text, the next step is to place them in categories. A category is a set of units of text. All units of analysis belonging to the same category should have either similar meanings (synonyms like ‘old’ and ‘elderly’, or equivalent connotations like ‘power’ and ‘wealth’) or shared formal characteristics (for example, one cate­gory could be ‘interrogative sentences’, another ‘affirmative sentences’, a third ‘silence’, a fourth ‘active verbs’, another ‘passive verbs’).

The more clear and precise the definitions of the units of analysis and categories are, the more reliable the coding will be. For this reason, it is advisable to establish a protocol specifying the rules and definitions of these elements.

2.3. Coding reliability

The combination of the ambiguity of discourse and the lack of precision in the definitions of categories and coded units or other coding rules makes it necessary to check coding reliability.

Reliability can be broken down into three more specific subcriteria (Weber, 1990):

  • Stability: this is the extent to which the coding results are the same when the same data is coded by the same coder more than once.
  • Accuracy: this dimension measures the proximity of a text’s classifications to a standard or norm. It is possible to establish this when the standard coding for a text has been elaborated. This type of reliability is rarely evaluated. Nevertheless, it can be useful to establish it when a coding protocol created by another researcher is being used.
  • Replicability (or inter-coder reliability): this criterion refers to the extent to which the coding produces the same results when the same data is coded by different coders. This is the most common method for evaluating coding reliability.
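As a rough illustration of how stability or replicability might be quantified, the sketch below compares two codings of the same units. The simple agreement rate follows directly from the definitions above. Cohen's kappa is not named in the source and is included only as one commonly used chance-corrected statistic. The category labels and data are hypothetical.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of units that both codings placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both codings must cover the same non-empty list of units")
    same = sum(a == b for a, b in zip(codes_a, codes_b))
    return same / len(codes_a)

def cohen_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both codings used one and the same category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codings of six units by two coders
# (or by one coder on two occasions, when checking stability).
coder_1 = ["power", "power", "strategy", "strategy", "power", "strategy"]
coder_2 = ["power", "strategy", "strategy", "strategy", "power", "strategy"]
agreement = percent_agreement(coder_1, coder_2)
kappa = cohen_kappa(coder_1, coder_2)
```

The same two functions could be applied to a coder's results and a standard coding when accuracy is the criterion being checked.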

3. Analyzing Data

Analyzing data is equivalent to making inferences based on the characteristics of the message which appeared in the data-coding results. The researcher can decide to analyze more specifically the structure of the representations or their contents, using quantitative or qualitative methods, in order to describe, compare, explain or predict. Each of these objectives requires different methods of analysis.

3.1. Analyzing content or structure

Content analysis consists of inferring the meaning of the discourse through detailed analysis of the words used, their frequency and their associations. The different modalities of analysis will be described in greater detail later on in this chapter, in relation to the methodology used (content analysis or cognitive mapping).

When analyzing the structure of a text, discourse or representation, the goal is to discover the rules of organization of the words, sentences and themes employed. Analysis of the structure of discourse or representations, although it does not enable us to perceive all of the thought or decision-making processes, does reveal certain cognitive characteristics, such as the subject’s cognitive com­plexity. Structure analysis can be used in particular for explaining or predicting behavior.

3.2. Quantitative or qualitative analysis

After coding, the text or discourse data can be interpreted with either quantitative or qualitative techniques. Quantitative analyses depend essentially on counting the units of analysis, or on more elaborate statistical analyses. These can be performed with the help of specialized software. Qualitative analyses allow us to interpret the arrangement of these units by placing them in a more global context. These analyses can be based on procedures which are not specific to discourse or text data analysis, such as, for example, seeking the opinions of experts. These judges, who could be the researcher himself or herself, members of the organization under study, the subject interviewed or outside experts, will evaluate the similarities or differences in the coded data in a more global manner.

These quantitative and qualitative analyses are complementary and should be used together for a richer interpretation of the data.

3.3. Describing, comparing, explaining or predicting

Analyzing discourse or representation can be done solely for the purpose of description. In that case, the point is to describe, using the data collected, the contents or structure of a text or representation. The researcher can also attempt to describe the structure of a discourse’s argumentation, or the state of beliefs of an active member of the organization (see Cossette and Audet, 1992).

If the researcher’s goal is to compare the discourse or representations of several different individuals, groups of individuals, or organizations, or to evaluate their evolution over time, then they will have to reveal the similarities and differences in their contents or structure (see Laukkanen, 1994). The researcher can undertake quantitative or qualitative comparative analyses. These methods will be presented further on in this chapter.

The researcher can also attempt to explain and, by extrapolation, predict, certain phenomena or behavior through discourse and representation analysis. For example, exposing important or antagonistic concepts within a representa­tion can testify to their importance for an individual or organization, and there­fore explain some of their behavior or decisions in situations where these concepts are activated (see Komokar, 1994).

After this overview of the general steps in representation and discourse analysis methods, we will present two of these methods more precisely here: content analysis and cognitive mapping.

Content Analysis

Content analyses were developed to study newspaper articles and political speeches in the USA in the 1920s. They are based on the theory that the repetition of certain elements of discourse (words, expressions or similar meanings) reveals the interests or concerns of the persons involved. Their purpose is to analyze the manifest contents of a communication.

The ‘content analysis’ category includes several different methods which, although they all follow the steps presented in Figure 16.1, differ in terms of the coding units selected and the methods used for analyzing the results.

We will restrict ourselves here to presenting the most common methods in organizational research, while acknowledging nevertheless that many other methods, such as those applied to newspaper studies, non-verbal communication analysis and linguistic analyses do exist (see, for more details, Roberts, 1997; Stubbs, 1983).

1. Collecting Data

Content analysis is performed on data that has been collected according to non-structured or semi-structured methods, such as (free or semi-directive) interviews.

Certain replies to questions in surveys can also be processed in this way. These are usually replies to simple open-ended questions, for example, ‘How do you evaluate your staff’s work?’ More generally, any kind of verbal communication or written material can be processed by content analysis.

2. Coding Data

As for any coding process, the discourse or text is broken down into units of analysis, then classified into the categories defined according to the purpose of the research.

2.1. Defining the units of analysis

There are basically two types of content analysis, which can be distinguished according to the unit of analysis chosen:

  • Lexical analyses, which are the more frequently used, examine the nature and range of vocabulary used in the discourse or text and analyze the frequency with which words appear. In this case, words are the unit of analysis.
  • Thematic analyses adopt sentences, portions or groups of sentences as their unit of analysis. This last type is more common in organizational studies (D’Aveni and MacMillan, 1990; Dougherty and Bowman, 1995).

2.2. Defining categories

Depending on the coding unit selected, categories are usually described:

  • Either in the form of a concept that will include words with related meanings (for example, the category ‘power’ could include words like strength, force or power). Computer-aided content analyses and their associated dictionaries, which automatically assign words to categories, can be used here. They offer several advantages: they reduce the amount of time spent on defining and validating categories, they standardize the classification process, and they facilitate comparisons with other studies.
  • Or in the form of broader themes (for example, competitive strategies) which include words, groups of words or even whole sentences or paragraphs (depending on the unit of analysis defined by the researcher). The main difficulty lies in defining the breadth of selected categories. For example, a category like ‘organizational strategies’ is broader than ‘competitive strategies’ or ‘competitiveness factors’. Defining the breadth of the category must be related to both the researcher’s objectives (narrow categories make comparative analysis more difficult) and the materials used (it is easier to construct narrow categories based on rich, in-depth interviews than on letters to shareholders, which are generally more shallow).
  • In certain cases, each category may be reduced to a single word. So there will be as many categories as there are different words that the researcher has decided to study (even if their meaning is similar). In this case, the words competitors and rivals, for example, will constitute two different categories.
  • Finally, the categories can be characteristics of types of discourse, such as pauses, different intonations, grammatical forms or different types of syntax.

Defining categories before or after coding In the a priori method, categories are defined prior to coding – on the basis of experience or the results of earlier research. This method is used when attempting to verify hypotheses arising from other studies. The organizational verbal behavior classification system used by Gioia and Sims (1986) is an excellent example of a priori classification in which the categories stem from earlier research. Boland and Pondy (1986) also relied on a priori classification to code transcriptions of budgetary meet­ings. The categories were defined in terms of the decision model used (fiscal, clinical, political or strategic) and the mode of analyzing the situation (instru­mental or symbolic).

In the ex post method, the categories are defined during the coding process. The choice of categories springs from the contents themselves. The idea is usu­ally to create an inventory of the different themes in the text or discourse being studied. The text must be read and reread several times in order to isolate the essential themes in relation to the purpose of the study. Themes whose impor­tance is underlined by repetition should suggest ideas for categories.

3. Analyzing Data

3.1. Quantitative analysis

Content analysis sprang from a desire for quantification in reaction to literary analysis. The qualitative notion was therefore foreign to its original concerns. In general, then, the first step of the analysis is to calculate, for each category, the number and frequency of the units of analysis: for each document studied, the number of units of analysis in each category is counted in order to deduce the category’s importance. The analyses performed in Boland and Pondy’s (1986) work dealt essentially with word-frequency counts. However, frequency calculation runs into several problems. In the first place, when the categories correspond to single words, these words may have different meanings depending on their context (whence the need to combine both quantitative and qualitative analysis). In addition, the use of pronouns, which often are not counted, can bias the frequency analysis if it involves nouns only.
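The counting step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a procedure taken from the source: the coding dictionary, the sample text and the function name category_counts are hypothetical, and words are used as the unit of analysis.

```python
import re
from collections import Counter

# Hypothetical coding dictionary: category -> words assigned to it.
CATEGORIES = {
    "power": {"strength", "force", "power"},
    "competition": {"competitors", "rivals", "competition"},
}

def category_counts(text, categories=CATEGORIES):
    """Count, per category, how many words of the text belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    word_to_category = {w: cat for cat, ws in categories.items() for w in ws}
    counts = Counter(word_to_category[w] for w in words if w in word_to_category)
    return counts, len(words)

sample = "Our strength is our force. Rivals lack the power to match us; competitors follow."
counts, total_words = category_counts(sample)

# Relative frequency of each category within the document.
frequencies = {cat: counts[cat] / total_words for cat in CATEGORIES}
```

A count of this kind ignores context and pronouns, which is exactly the limitation noted above, so it is best read alongside a qualitative look at how the counted words are used.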

Researchers performing content analysis also have at their disposal various statistical data-analysis techniques, of which factor analysis is the most commonly used. It enables us, for example, to associate the presence of a greater or lesser number of units in a given category with the presence of a greater or lesser number of units in another category. Other types of analysis, such as regressions, discriminant analyses, and cluster analysis, can also be performed. It is up to each researcher to determine the most appropriate analyses for the purposes of the research. For example, to study the relation between managerial attribution and verbal behavior in manager-subordinate interaction during a simulated performance appraisal, Gioia and Sims (1986) performed content analysis on verbal behavior. This analysis was based particularly on a set of statistical analyses: multivariate analysis of variance, t tests and correlation analysis.

3.2. Qualitative analysis

A more qualitative analysis, aimed at judging, rather than measuring, the importance of the themes in the discourse, can also be performed. The difference between quantitative and qualitative analysis lies in the way they perceive the notion of importance for a category: ‘how often’ for quantitative analysis or ‘theme value’ for qualitative analysis. Qualitative analysis tries to interpret the presence or absence of a given category, taking into account the context in which the discourse was produced (which can explain the presence or absence of certain categories). Qualitative analysis also allows for a more refined approach, studying the units of analysis in their context in order to understand how they are used (with which other units of analysis do they appear, or are they associated, in the text?).

Qualitative analysis enables the researcher to go beyond simple content analysis of a discourse or document. It enables us to formalize the relations between the different themes contained in a communication in order to reveal its structure. Thanks to content analysis, it is equally possible to study the con­tents or the structure of a discourse or document.

Finally, content analysis can be used for descriptive, comparative or explanatory ends. Content analysis allows us to go beyond plain description of the contents of a communication and to discover the reasons for certain strategies or behaviors. By revealing the importance of certain themes in the discourse, content analysis can lead to explanations for the behaviors or strategies of the authors of the discourse analyzed. It is also possible to make certain unrecognized variables or influence factors appear, or to reveal relations between different organizational behaviors and different concerns of the organization’s leaders. By analyzing the contents of letters to shareholders, the work of D’Aveni and MacMillan (1990) succeeded in revealing the relation between the decision-makers’ main points of interest (focused on the internal or external environment) and the companies’ capacity to weather a crisis.

Cognitive Mapping for Analyzing Representations and Discourse

As stated earlier, cognitive mapping is aimed at establishing and analyzing cognitive maps, that is, representations of a person or organization’s beliefs concerning a particular domain (Axelrod, 1976). A cognitive map is made up of two types of elements:

  • Concepts, also known as constructs or variables.
  • The links between these concepts. These links can be based on similarity (that is, category schemes; see Rosch, 1978), contiguity, resemblance, influ­ence (that is, influence maps; see Cossette and Audet, 1992), causality between concepts (that is, causal maps; see Huff, 1990) or all of these at once (that is, cognitive maps, Bougon 1983, although certain authors use the term when referring to maps that indicate only causal or influence links between variables).

1. Collecting Data

1.1. Structured or a priori methods

The researcher chooses a set of variables (generally around ten) which he or she considers to be applicable to defining the domain that the map will represent, either based on existing research (Ford and Hegarty, 1984) or on preliminary interviews (Bougon, 1983). Then the links are collected from the respondents by having them consider each pair of variables. For example, does variable A influence variable B? If so, is the influence positive or negative? These links are collected by presenting the respondent with a matrix that cross-references all the variables (Ford and Hegarty, 1984), or pairs of cards with the variables to consider written on them (Bougon, 1983).
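The pairwise collection described above can be sketched as follows. The variable names and answers are hypothetical, and the encoding (+1 positive influence, -1 negative influence, 0 no link) is an assumption made for illustration, not a convention stated in the source.

```python
from itertools import permutations

# Hypothetical variables chosen by the researcher.
variables = ["cost-effectiveness", "corporate growth", "employee adaptability"]

# Every ordered pair (A, B) with A != B is one question: "does A influence B?"
questions = list(permutations(variables, 2))

# Hypothetical answers from one respondent; pairs not listed mean "no link".
answers = {
    ("cost-effectiveness", "corporate growth"): +1,
    ("employee adaptability", "cost-effectiveness"): +1,
}

def influence_matrix(variables, answers):
    """Square matrix: rows are influencing variables, columns are influenced ones."""
    return [
        [answers.get((a, b), 0) if a != b else 0 for b in variables]
        for a in variables
    ]

matrix = influence_matrix(variables, answers)
```

With ten variables this procedure generates 90 ordered pairs for the respondent to consider, which gives a sense of the effort a full matrix asks of each respondent.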

Because the validity of data produced by such methods is limited (the variables do not originate from the respondent), researchers may prefer to use more natural methods.

1.2. Non-structured methods

When dealing with non-structured methods, as we have seen, data is collected through documents or more open interviews.

Interview methods Researchers collect discourse data through in-depth interviews which are recorded and then entirely transcribed before coding (see, for example, Cossette, 1994). Because of the amount of pre-collection work that these methods entail, some researchers have developed semi-directive methods for cognitive mapping (Ackermann et al., 1992; Eden et al., 1983; Laukkanen, 1989; 1992). The subject is asked about the causes, effects and consequences of the chosen theme for the interview (which will be the central variable on the map). This reveals new variables which are noted down and which can then be submitted to a similar set of questions, and so on and so forth until the person reaches saturation point – that is, has nothing more to say. Thus the map is built interactively with the respondents, who can see it and therefore reflect on their own vision. This method is commonly used to help managers make decisions and formulate strategic problems (Ackermann et al., 1992; Cossette and Audet, 1992). The COPE software developed by Eden (1988; 1990) was in fact conceived essentially with this goal in mind. The main advantage of this method is that it eliminates the need for the researcher to do a lot of coding work. On the other hand, it reduces the possibility of comparing different subjects’ maps, as they are no longer structured by the researcher beforehand. In addition, since this method implies showing the map to the respondents as they go along, it restricts the validity of the collected data.

Documentary methods. The other method available to the researcher is to establish a map based on written documents or interview transcriptions. These methods are mainly used in management for establishing representations of organizations or groups of managers. In these methods, the interview transcriptions or documents collected must be coded by the researcher.

2. Coding Data

2.1. Defining units of analysis

As a cognitive map is a representation composed of concepts and the links that relate them to each other, all of the statements that contain this type of relation should be identified in the text. The type of statement a researcher is looking for will depend on the type of map he or she wishes to make. We may seek:

  • statements containing links of influence (of the ‘A affects, encourages, prevents B’ kind), to construct an influence map;
  • statements containing causal links (of the ‘A causes B, if A then B’ kind), to establish a causal map;
  • statements containing similarity links (A is like or unlike B) or hierarchy links (A is included in B, A is an example of B), to construct category schemes;
  • or all of these links, to construct a cognitive map (Bougon, 1983).

The unit of analysis in cognitive mapping is, therefore, a statement of the ‘concept A/link/concept B’ type (Axelrod, 1976). These statements generally correspond to a sentence like ‘cost-effectiveness encourages corporate growth’ (influence relationship), ‘if cost-effectiveness increases, then corporate growth is encouraged’ (causal relationship), ‘quality concerns are often similar to job-safety problems’ (similarity relationship). Some statements, however, can be spread out over several sentences: ‘The employees are on strike. We won’t be able to deliver to our customers on schedule’ (causal relationship between the two sentences). Since units of analysis do not necessarily correspond to a precise unit of text (for example, sentences), evaluating the reliability of their definition is highly recommended (Axelrod, 1976). The agreement rate between two coders will be calculated in terms of which elements they both consider to be codable in the text (see Robinson, 1957).

2.2. Defining categories

Identifying concepts. Once all the statements have been identified, the researcher will attempt to locate within these statements the elements that the speaker considers to be links or influencing or influenced concepts (causes and effects, means and consequences). To facilitate identification of cause and effect elements (or influencing and influenced factors), Wrightson (1976) and Huff (1990) advise asking the following questions:

  • ‘Which came first, A or B?’
  • ‘Does A logically precede B?’
  • ‘Does A necessarily precede B?’

A being the supposed causal concept and B being the supposed effect.

When coding statements in a text, the general rule is not to modify their meaning. Variables are generally preserved in their literal form. However, in order to give the map a more dynamic aspect, it is advisable to transform nominal propositions expressing actions into the corresponding verb (Ackermann et al., 1992). So the variable ‘Increase in the promotional budget’ would be transformed into ‘Increasing the promotional budget’. In addition, concepts should be expressed in the form of a variable, which can require modifying them slightly. For example, the variable ‘product listing’ should be expressed as ‘quality or degree of product listing’.

Identifying links. The next step is to search for the nature of the link relating the concepts that have been identified. These links can usually be identified by verbs (such as implies, leads to, prevents, etc.). For causal or influence maps, we generally look for the following links:

  • Positive influence or causal links (graded /+/): leads to, causes, has as a consequence, increases, enables …
  • Negative influence or causal links (graded /-/): prevents, harms, damages, reduces, is harmful to, gets in the way of, decreases, diminishes, restricts …
  • Non-influence links (graded /0/): has no effect on, is not tied to …
  • Positive non-implication influence links (graded /0+/): does not entail, does not increase, does not allow/enable, does not lead to …
  • Negative non-implication influence links (graded /0-/): does not prevent, does not harm …
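The link grades listed above can be kept as a lookup table while coding. The sketch below, in Python, is only a coding aid under our own assumptions: the `LINK_GRADES` dictionary and the `grade_link` helper are hypothetical names, the verb phrases are those of the list, and simple substring matching will miss inflected or paraphrased forms, so the coder's judgment remains necessary.

```python
# Verb phrases and grades copied from the list above; the lookup is only a coding aid.
LINK_GRADES = {
    "+": ["leads to", "causes", "has as a consequence", "increases", "enables"],
    "-": ["prevents", "harms", "damages", "reduces", "is harmful to",
          "gets in the way of", "decreases", "diminishes", "restricts"],
    "0": ["has no effect on", "is not tied to"],
    "0+": ["does not entail", "does not increase", "does not allow",
           "does not enable", "does not lead to"],
    "0-": ["does not prevent", "does not harm"],
}

def grade_link(statement):
    """Return the grade of the longest listed verb phrase found in the statement."""
    text = statement.lower()
    matches = [(len(phrase), grade)
               for grade, phrases in LINK_GRADES.items()
               for phrase in phrases if phrase in text]
    # None signals that no listed phrase was found and manual coding is needed.
    return max(matches)[1] if matches else None

print(grade_link("Cost-effectiveness leads to corporate growth"))  # +
print(grade_link("Cost-cutting does not prevent growth"))          # 0-
```

Preferring the longest matching phrase is a design choice of this sketch; it is one simple way to keep a negated phrase from being read as the shorter positive or negative verb it might contain.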

Besides these categories for qualifying relations of influence between identified variables precisely, Huff (1990) adds a certain number of coding rules which, although their goal is not to code relations of influence per se, do facilitate a later merging of the variables involved in different relations. These rules enable us, in fact, to identify the relations of ‘definition’ – in the broadest sense – expressed by subjects between different variables: particularly relations between examples illustrating a variable’s level (for example, ‘Machine X is a dangerous machine’), and relations of equality (for example, ‘Competition means rivalry between conductors’). These categories allow us to detect connotative links between variables, thereby facilitating their later merging. However, taking these links into account makes data coding more complicated and weighs down the graphic representation that will be established.

The ambiguity of discourse data makes these coding operations difficult. Identifying variables and the links between them is no easy task, and certain links of influence (for example, contingent, interactive or reversible causal relations) have proven extremely difficult to handle using the classic rules of coding (Cossette and Audet, 1992). Processing choices made in these cases (not coding them, creating specific coding rules, assimilating them to other relationships) must be specified.

Once the concepts and links in the statements that are considered codable have been identified, we have a list of relationships at our disposal. The next step is to combine the synonymous and redundant variables and influence links.

2.3. Merging similar concepts and variables

This phase involves deciding which variables the researcher considers as similar or synonymous. While the general rule is, when in doubt, leave the variables the way the interviewee expressed them, several guidelines can be of help during this phase (Wrightson, 1976):

  • If a variable is mentioned several times by a single respondent, then it is very likely that a modified expression of this variable is nothing more than a stylistic variation. We can then merge the variable and its modified expression, unless the respondent had explicitly specified a distinction between the two.
  • If a variable appears to be an example or a part of a more general one, then the two can be merged, as long as they are both expressed by the same person.
  • The basic rule underlying this operation is to ask oneself if the respondent’s comments would be fundamentally modified if the merging were carried out.

By combining similar concepts and variables, links can be made between seemingly unrelated variables, which is of great assistance in drawing the interviewee’s cognitive map. In order to facilitate data coding, it is a good idea to make a list of concepts and their corresponding merged terms.

Example: Coding data to establish an influence map

Say we seek to code the following paragraph:

‘Increasing our publicity budget and the efforts on the part of our salesforce enabled us to improve our product listing in mass distribution. So our market share improved considerably, moving us from fifth to second place in our market.’

    1. Identify codable statements

The paragraph contains four codable statements: two in the first sentence (increasing our publicity budget enabled us to improve our product listing in mass distribution; the efforts on the part of our salesforce enabled us to improve our product listing in mass distribution), one connecting the first and the second sentences (improving our product listing in mass distribution led to a considerable improvement in our market share), and one in the last sentence (our market share improved considerably, going from fifth to second place in our market).

    2. Specify the influencing factors/link/influenced factors

The statement ‘increasing our publicity budget enabled us to improve our product listing in mass distribution’ includes an influencing variable (increasing our publicity budget), an influenced variable (our product listing in mass distribution), and a positive influence relationship (enabled us to improve). The following example shows how the statement is coded:

Publicity budget  /+/ (Quality or degree of) product listing

This is how the other statements are coded:

Our salesforce’s efforts (towards distributors) /+/ (Quality or degree of) product listing

(Quality or degree of) product listing /+/ Market share

Market share /+/ Going from fifth to second place in our market

    3. Merging synonymous variables

As the coded variables are different from each other, there are no merging operations to do. If, instead of the second sentence, we had had, ‘Better store placement enabled us to improve our market share considerably, going from fifth to second place’, we would undoubtedly have merged the variables ‘degree of product listing’ and ‘store placement’ and kept only one of them.
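The coded statements of this example can be stored as simple (influencing variable, link, influenced variable) triples, together with a list of merged terms. The Python sketch below is only an illustration under our own assumptions: the tuple layout, the shortened variable labels, the `merged_terms` dictionary and the `apply_merges` helper are hypothetical and are not part of the coding method itself.

```python
# Each coded statement is an (influencing variable, link, influenced variable) triple.
# Labels are shortened: "Product listing" stands for "(Quality or degree of) product listing".
coded_links = [
    ("Publicity budget", "+", "Product listing"),
    ("Salesforce's efforts", "+", "Product listing"),
    ("Product listing", "+", "Market share"),
    ("Market share", "+", "Going from fifth to second place"),
]

# Hypothetical list of merged terms: variant wording -> term kept on the map.
merged_terms = {"Store placement": "Product listing"}

def apply_merges(links, merges):
    """Replace merged variants by the retained term and drop duplicate links."""
    seen, result = set(), []
    for source, sign, target in links:
        link = (merges.get(source, source), sign, merges.get(target, target))
        if link not in seen:
            seen.add(link)
            result.append(link)
    return result

# The alternative second sentence discussed in the text, coded before merging.
alternative = coded_links[:2] + [("Store placement", "+", "Market share")]
for link in apply_merges(alternative, merged_terms):
    print(link)
```

Keeping the merge list separate from the coded triples means the interviewee's original wording is preserved and a merging decision can be revised without recoding the text.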

2.4. Coding reliability

Coding then follows the procedure described above (see Subsection 1.2). Inter-coder reliability must be established: (1) for the units identified as codable, (2) for the classification of these units, (3) for merging decisions (see Axelrod, 1976, for calculating reliability rates).

Once the set of relations has been identified and merged, they can be represented graphically (with nodes representing the concepts, and arrows the links between them).
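As a rough illustration of an agreement rate on codable units, the Python sketch below computes the share of units that both coders flagged among all units flagged by either coder. This is a deliberately simple measure chosen by us for illustration; it is not the formula given by Axelrod (1976) or Robinson (1957), and the `agreement_rate` function and the unit identifiers are hypothetical.

```python
def agreement_rate(coder_a, coder_b):
    """Share of units both coders marked codable among units marked by either."""
    a, b = set(coder_a), set(coder_b)
    if not a and not b:
        return 1.0  # nothing flagged by anyone: trivially in agreement
    return len(a & b) / len(a | b)

# Hypothetical unit identifiers (for example, sentence numbers) flagged as codable.
coder_a = {1, 2, 4, 7}
coder_b = {1, 2, 5, 7}
print(agreement_rate(coder_a, coder_b))  # 3 shared units out of 5 flagged: 0.6
```

The same function can be reused for the classification and merging steps by passing, for each coder, the set of (unit, category) or (variable, merged term) pairs instead of unit identifiers.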

2.5. Graphic representations of cognitive maps

Graphic representations can be arranged in different ways. Axelrod (1976) put the factual variables (environmental data, political options) on the left and the goals and consequences on the right (see Figure 16.2). Bougon et al. (1977) draw a map with the variables in a circle, so the final representation looks like a spider web. Eden et al. (1992) arrange them from bottom to top (from the means to the goals). Cossette (1994), using the COPE software, arranges the map in such a way that the distance separating variables that are linked through direct influence is as small as possible.

For someone using a priori methods, a graphic representation is less useful since the maps are subjected essentially to quantitative analysis. They are, therefore, generally left in the form of a matrix.

The matrix in Figure 16.3 corresponds to the map in Figure 16.2. This matrix cross-references the set of n variables included in the map with themselves. Cell ij indicates the absence (value 0) or presence (non-zero value) of a direct-influence link of the variable in row i on the variable in column j, as well as the polarity of this influence (positive, +1; or negative, -1).
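The matrix form can be sketched in a few lines of Python. The example below does not reproduce Figures 16.2 and 16.3; it uses three links taken from the coding example earlier in this section, and the `to_matrix` helper and the alphabetical ordering of variables are our own assumptions.

```python
# Links as (influencing variable, polarity, influenced variable); +1 positive, -1 negative.
links = [
    ("Publicity budget", +1, "Product listing"),
    ("Salesforce's efforts", +1, "Product listing"),
    ("Product listing", +1, "Market share"),
]

def to_matrix(links):
    """Return (variables, matrix) where matrix[i][j] is the influence of i on j."""
    variables = sorted({v for s, _, t in links for v in (s, t)})
    index = {name: k for k, name in enumerate(variables)}
    matrix = [[0] * len(variables) for _ in variables]  # 0 means no direct link
    for source, polarity, target in links:
        matrix[index[source]][index[target]] = polarity
    return variables, matrix

variables, matrix = to_matrix(links)
for name, row in zip(variables, matrix):
    print(f"{name:22s}", row)
```

Rows are read as "influences" and columns as "is influenced by", so a row of zeros marks a variable that influences nothing on the map, and a column of zeros a variable that nothing influences.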

3. Analyzing the Data

3.1. Analyzing the structure of cognitive maps

Two main types of methods can be distinguished here: (1) methods which evaluate the subjects’ cognitive complexity via general quantitative indicators, and (2) methods which reveal the structural dimensions upon which the subjects have organized their representations.

Complexity indicators. In order to grasp the structure of a cognitive map, one can calculate a set of general complexity indicators:

  • The number of variables (Weick and Bougon, 1986) and clusters (Cossette and Audet, 1992; Eden et al., 1992) included in the maps, which are indicators of the degree of differentiation between their constitutive elements.
  • The number of links (Eden et al., 1992) and loops (Axelrod, 1976; Cossette and Audet, 1992; Eden et al., 1992; Weick and Bougon, 1986), which are indicators of their degree of interconnection.

These indicators depend largely on the method of establishing influence maps, and particularly on the degree of merging operated between synonymous variables and influence links.
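Three of these indicators (variables, links and loops) can be counted directly from a list of links. The Python sketch below is a minimal illustration under our own assumptions: a loop is taken to be a simple directed cycle, the `complexity_indicators` function is hypothetical, clusters are left out because they depend on the clustering method chosen, and the exhaustive search is only practical for small maps.

```python
def complexity_indicators(links):
    """Count variables, links and loops in a list of (source, target) pairs."""
    successors = {}
    for source, target in links:
        successors.setdefault(source, set()).add(target)
        successors.setdefault(target, set())
    order = sorted(successors)

    def count_cycles_from(start):
        # Count simple cycles whose smallest variable (in sort order) is `start`,
        # so that each loop is counted exactly once.
        count = 0
        stack = [(start, {start})]
        while stack:
            node, visited = stack.pop()
            for nxt in successors[node]:
                if nxt == start:
                    count += 1
                elif nxt > start and nxt not in visited:
                    stack.append((nxt, visited | {nxt}))
        return count

    return {
        "variables": len(order),
        "links": len(set(links)),
        "loops": sum(count_cycles_from(v) for v in order),
    }

# Hypothetical map: A -> B -> C -> A forms one loop; C -> D adds no loop.
example = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(complexity_indicators(example))
```

Because the counts change whenever variables are merged, such indicators are only comparable across maps that were coded and merged with the same rules.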

Methods for analyzing the organization of cognitive maps. Along with these measurements, there is a set of methods that are more specifically intended to capture the organization of the subjects’ representations. We can start by looking for the ‘central’ concepts in the map, that is, the concepts which are either strongly influenced (in which case they are taken as goals to achieve or consequences) or strongly influencing (seen as reasons for the phenomena described in the map or means of acting upon them; Cossette and Audet, 1992). A concept’s centrality is generally defined by the number of direct links through which it influences or is influenced by other concepts (Hart, 1976), or else by the total number of direct and indirect links influencing or influenced by it, weighted by the average length of the paths linking the factor under consideration to the others (Eden et al., 1992).

The general organization of a map can also be captured through cluster analysis. The particular dimensions around which subjects organize their representations are determined in this manner.
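The first definition of centrality, based on direct links only, can be sketched as follows in Python. The `direct_centrality` function is a hypothetical helper of ours and the links come from the coding example earlier in this section; the weighted measure that also takes indirect links into account is not reproduced here.

```python
from collections import Counter

def direct_centrality(links):
    """Number of direct links each concept receives (in) and sends (out)."""
    out_links, in_links = Counter(), Counter()
    for source, target in links:
        out_links[source] += 1
        in_links[target] += 1
    concepts = set(out_links) | set(in_links)
    return {c: {"in": in_links[c], "out": out_links[c],
                "total": in_links[c] + out_links[c]} for c in concepts}

links = [
    ("Publicity budget", "Product listing"),
    ("Salesforce's efforts", "Product listing"),
    ("Product listing", "Market share"),
]
scores = direct_centrality(links)
most_central = max(scores, key=lambda c: scores[c]["total"])
print(most_central, scores[most_central])
```

Reporting incoming and outgoing links separately keeps the distinction made in the text between strongly influenced concepts (goals or consequences) and strongly influencing ones (reasons or means).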

Eden’s COPE software enables us to do all of these analyses. The advantage of these methods is that they do not require a priori structuring of the maps or standardization of the variables and links found within them (Daniels et al., 1993), while still enabling comparison between maps from different subjects or from the same subject at different times. On the other hand, they only examine the maps’ organization, and not their content.

3.2. Analyzing the content of cognitive maps

The main purpose of the different methods for analyzing the contents of representations is to compare representations collected from a single subject at different times or from a group of subjects at the same time.

Global methods of comparing the content of cognitive maps. Global methods for comparing the contents of individual representations establish general indicators of content similarity based on the complete set of variables and links the representations contain. Within these methods, we generally distinguish between measures of distance and of similarity.

Distance measures. Distance measures are generally based on the Euclidean notion of distance. Nevertheless, due to a desire to compare the measurements obtained and to apply these ratios to different types of maps, these measurements quickly became more complex through weighting, taking into account the strength of the links, the possible number of polarities in the influence links, the nature of the variables (receptive or transmitting), the unique nature of certain types of variables, the strength of the beliefs relative to the existence of links and so on (Langfield-Smith and Wirth, 1992; Markoczy and Goldberg, 1993). Despite the increase in the formulae’s complexity and adaptability, these methods still require some structuring of the maps by the researcher: either through a priori collection methods or through ex post merging of the variables and links used by the subjects (Daniels et al., 1993). They thus reduce the validity of the individual maps and are not very well adapted to maps generated by non-structured methods.
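In its simplest, unweighted form, a Euclidean distance between two maps defined on the same standardized variables can be sketched as below in Python. This is only the basic notion, not the weighted formulae of Langfield-Smith and Wirth (1992) or Markoczy and Goldberg (1993); the `euclidean_distance` function, the dictionary layout and the two example maps are our own assumptions.

```python
import math

def euclidean_distance(map_a, map_b, variables):
    """Euclidean distance between two maps defined on the same variable list.

    Each map is a dict {(source, target): polarity}; missing pairs count as 0.
    """
    total = 0.0
    for source in variables:
        for target in variables:
            if source == target:
                continue  # a variable's influence on itself is not compared
            diff = map_a.get((source, target), 0) - map_b.get((source, target), 0)
            total += diff ** 2
    return math.sqrt(total)

variables = ["A", "B", "C"]  # hypothetical standardized variables
manager_1 = {("A", "B"): +1, ("B", "C"): +1}
manager_2 = {("A", "B"): +1, ("B", "C"): -1, ("A", "C"): +1}
print(euclidean_distance(manager_1, manager_2, variables))
```

In this sketch a link of opposite polarity counts for more than a link that one subject simply omits, which is one reason why the choice of weighting matters when such distances are interpreted.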

Similarity measures. To establish similarity measurements, we first need to find the number of points in common between the different spaces to be compared in terms of links or variables, which can be weighted by the number of common and different points available (see, for example, the Jaccard or Dice indices; Jaccard et al., 1990). Like distance measures, these ratios require a certain amount of standardization of the links and variables. However, unlike with distance measures, the researcher can perform these operations ex post. These measures also have the advantage of being founded on dimensions shared by the maps that are being compared – the results obtained are then easier to interpret.

Aside from these mathematical evaluations, Daniels et al. (1993) suggest asking independent judges to evaluate the degree of similarity of pairs of maps. The average evaluation established in this manner will be retained as an indicator of their similarity. Unlike the preceding mathematical methods, turning to qualitative methods has the following advantages:
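The Jaccard and Dice indices can be computed on the sets of links (or variables) of two maps once these have been standardized. The Python sketch below uses the usual set-based definitions of the two indices; the function names and the two example maps are hypothetical.

```python
def jaccard(a, b):
    """Shared elements divided by all elements present in either map."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def dice(a, b):
    """Twice the shared elements divided by the sum of both map sizes."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 1.0

# Hypothetical maps described by their standardized links.
links_1 = {("A", "+", "B"), ("B", "+", "C"), ("C", "-", "D")}
links_2 = {("A", "+", "B"), ("B", "+", "C"), ("A", "+", "D")}
print(jaccard(links_1, links_2))  # 2 shared links out of 4 distinct links: 0.5
print(dice(links_1, links_2))     # 2 * 2 shared links over 3 + 3 links
```

Both indices range from 0 (nothing in common) to 1 (identical sets); Dice gives more weight to the shared elements and is therefore never lower than Jaccard for the same pair of maps.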

  • It enables comparison between entirely idiosyncratic maps. Neither a priori nor ex post structuring is required.
  • It is not contingent on a particular type of map.

These characteristics undoubtedly endow these methods with greater validity than mathematical measurements. Unfortunately, the size of the maps to be compared poses serious problems. Although it is easy to use qualitative evaluation for maps containing 10 to 50 variables, it becomes much less so when evaluating similarity for idiosyncratic maps that can contain 100 to 200 concepts. It can, therefore, be useful to focus analysis on a part of a cognitive map, rather than on the whole.

Local methods for analyzing the content of cognitive maps. This is one of the purposes of domain analysis. Domain analysis was developed by Laukkanen (1989; 1994) for comparing maps from different subjects or groups thereof. These analyses bear on sub-maps made up of factors that are to some extent directly influenced by, or influence, a variable that particularly interests the researcher. To this end, the CMAP2++ software developed by Laukkanen automatically generates databases made up of all the links and variables included in the sub-maps in which that variable is inserted. These lists of links and variables are easy to represent graphically and to analyze qualitatively (see Laukkanen, 1994) by, for example, collating the chains of factors influencing or influenced by a variable included in different subjects’ sub-maps.
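The idea of a sub-map built around one variable of interest can be sketched in Python as follows. This is a minimal illustration limited to direct links; it does not reproduce the behavior of Laukkanen's software, and the `domain_submap` function and the example links (taken from the coding example earlier in this section) are our own assumptions.

```python
def domain_submap(links, focal):
    """Keep the links in which the focal variable is a direct cause or effect."""
    return [(s, sign, t) for s, sign, t in links if focal in (s, t)]

links = [
    ("Publicity budget", "+", "Product listing"),
    ("Salesforce's efforts", "+", "Product listing"),
    ("Product listing", "+", "Market share"),
    ("Market share", "+", "Going from fifth to second place"),
]
for link in domain_submap(links, "Product listing"):
    print(link)
```

Applying the same function to several subjects' link lists yields one sub-map per subject around the same focal variable, which can then be collated and compared qualitatively.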

These analyses, bearing on the structure and contents of cognitive maps, can serve different research purposes. Up until now, most research concerned map analysis as such (see Cossette and Audet, 1992) or the elaboration of analysis methods (see Laukkanen, 1992; Eden et al., 1979; 1983). But cognitive maps can also serve to explain or to predict other behavioral variables or organizational phenomena (Komokar, 1994; Axelrod, 1976) and to assist in decision making (Eden et al., 1983; Eden, 1988; 1990; Eden and Banville, 1994).


Communication Media for the Research

Communication of research work can be directed towards various readers. Research on organizations may be of interest to three particular kinds of readers: researchers, managers and other members of the organization, and, to a lesser extent, the general public. The same research project may then need to be adapted to different target audiences – and for this the text will need to be encoded. This encoding is the subject of the first part of this section, while the second part identifies the requirements of different audiences. The third and final part will examine the particular case of communication media destined for a readership made up of other researchers.

1. Encoding

Encoding is the process of producing a message according to an appropriate system of signs – a code. This message will then be decoded by the recipients, as long as they share, or know, the code used. If the message is to be understood, it is important that it conform strictly to the code.

What concerns us particularly is that the code has a bearing on the form of the text, and indicates to the readers what type of message it is: research, summary, popularization, etc. For this reason, in encoding a text for a scientific purpose researchers generally include a section on methodology, direct the reader to appropriate references and avoid what Daft (1995) calls an amateur style and tone – such as exaggerating the importance of the project, over-insistence (for example by using exclamation marks) or expressing an overly negative attitude towards previous studies.

Such encoding might seem constraining to the writer of the text, in so far as the form of the text is to a large degree imposed from above. If one considers, however, the ‘Instructions to Contributors’ provided by all academic journals, the importance of encoding is tangible. A text that disregards this encoding will be seen as deviating from the norm, and it will be more difficult to have it accepted for publication. In some cases an article can be rejected without thorough evaluation, simply because its form betrays the fact that the author has not tried to adapt the work to the journal’s editorial policy. However, the strictness with which certain criteria are applied depends very much on the people at the top of the system: the gatekeepers (the publication’s editor and reviewers).

Although the code is generally shared across different publication media within the scientific genre, particular publications often have their own lesser conventions that should also be respected. For example, Administrative Science Quarterly and the Academy of Management Journal share the same scientific code, but one can easily tell, on reading an article, from which of these journals it has been taken.

2. Target Readerships

The three kinds of possible readerships can be classified according to their exposure to different forms of communication. First, researchers form the principal audience. Managers may be interested, to a lesser degree, in accordance with the theme of the study or the industry being studied, and particularly in the application aspects of the research. Finally, it is only rarely that the general public has any contact with research. For these reasons, we will not broach here the question of adapting research to a very large readership. Richardson (1990), however, provides a good analysis of an example of adaptation for the general public.

Research can be communicated through five general types of publications: research articles, managerial articles, books, conference papers and reports. To these can be added, although generally only once in the life of a researcher, a doctoral dissertation. There is a degree of consistency between the two principal readerships – researchers and managers – and the different publication media, as is shown in Table 17.1.

Other researchers are interested in most publication media, but their expectations can vary in line with the medium in which they are seeking information. They may look for specific results in an academic journal, but use a management journal to find examples to use in illustrating a course. Unfortunately, managers are less concerned with research, apart, of course, from applied studies that they have commissioned and that are to be the subject of a report. Managers can, however, ensure the success of some books as far as sales are concerned, or turn to practitioner-oriented journals to keep themselves informed.

As these different readerships have different expectations, the content of the text must be adapted accordingly, keeping within the encoding norms appropriate for the intended medium. Table 17.2 presents encoding rules as a function of the intended readership.

It is sometimes difficult to apply these encoding rules, as the same text can have various readerships.

As an example of the adaptation of research articles for a public consisting of managers, The Academy of Management Executive has since 1993 provided ‘research translations’ – presentations of research articles that have appeared in other journals. These presentations are often much shorter (two pages) than the original article, are not referenced and are written in a direct style, with emphasis on the implications of the research.

3. Media for Researchers

Before we look at research articles, the principal medium in which research work is written, let us look briefly at the various possible means of communicating with colleagues in the field.

We can distinguish between oral communication methods, such as conference presentations, and written methods, such as articles. Within these two groups we can also distinguish works based on data from those that are purely conceptual.

Conference presentations generally do not go beyond 20 minutes. It is important to observe this time limit, so as to leave room for discussion with the audience. Speakers must therefore plan their presentation carefully beforehand, and select the points to develop. A quick presentation of the fundamental theoretical concepts and the main lines of development of the research is recommended. The accent should always be placed on the results and on their contributions to the field, both practical and theoretical.

As for written media, the differences between them depend more on the presence of data than on the type of document. A conceptual article or a literature review will necessarily be presented differently than an empirical article. Methodology and results will be absent, with the emphasis placed on theoretical implications and on future research. To see the differences between empirical and non-empirical articles, I refer you to the Academy of Management Journal, which publishes only empirical articles, and to the Academy of Management Review, which publishes only conceptual articles.
