Normal Probability Distribution

The most commonly used probability distribution for describing a continuous random variable is the normal probability distribution. The normal distribution has been used in a wide variety of practical applications in which the random variables are heights and weights of people, test scores, scientific measurements, amounts of rainfall, and other similar values. It is also widely used in statistical inference, which is the major topic of the remainder of this book. In such applications, the normal distribution provides a description of the likely results obtained through sampling.

1. Normal Curve

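The form, or shape, of the normal distribution is illustrated by the bell-shaped normal curve in Figure 6.3. The probability density function that defines the bell-shaped curve of the normal distribution follows.

f(x) = [1 / (σ√(2π))] e^(–(x – μ)² / (2σ²))     (6.2)

where μ is the mean, σ is the standard deviation, π ≈ 3.14159, and e ≈ 2.71828.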
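As a rough sketch of how the normal density can be evaluated numerically, the Python snippet below codes the density by hand and compares it with `statistics.NormalDist.pdf` from the standard library. The helper name `normal_pdf` and the sample values (mean 0, standard deviation 1) are illustrative assumptions, not part of the text.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma (hypothetical helper)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Illustrative values only: evaluate the curve at its mean for mu = 0, sigma = 1.
peak = normal_pdf(0.0, 0.0, 1.0)
library_peak = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)
print(f"{peak:.4f} {library_peak:.4f}")  # 0.3989 0.3989
```

The two values agree, which suggests the hand-coded formula matches the library's definition of the density.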

We make several observations about the characteristics of the normal distribution.

  1. The entire family of normal distributions is differentiated by two parameters: the mean μ and the standard deviation σ.
  2. The highest point on the normal curve is at the mean, which is also the median and mode of the distribution.
  3. The mean of the distribution can be any numerical value: negative, zero, or positive. Three normal distributions with the same standard deviation but three different means (-10, 0, and 20) are shown in Figure 6.4.
  4. The normal distribution is symmetric, with the shape of the normal curve to the left of the mean a mirror image of the shape of the normal curve to the right of the mean. The tails of the normal curve extend to infinity in both directions and theoretically never touch the horizontal axis. Because it is symmetric, the normal distribution is not skewed; its skewness measure is zero.
  5. The standard deviation determines how flat and wide the normal curve is. Larger values of the standard deviation result in wider, flatter curves, showing more variability in the data. Two normal distributions with the same mean but with different standard deviations are shown in Figure 6.5.
  6. Probabilities for the normal random variable are given by areas under the normal curve. The total area under the curve for the normal distribution is 1. Because the distribution is symmetric, the area under the curve to the left of the mean is .50 and the area under the curve to the right of the mean is .50.
  7. The percentages of values in some commonly used intervals are:
    • 68.3% of the values of a normal random variable are within plus or minus one standard deviation of its mean.
    • 95.4% of the values of a normal random variable are within plus or minus two standard deviations of its mean.
    • 99.7% of the values of a normal random variable are within plus or minus three standard deviations of its mean.
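
The percentages for one, two, and three standard deviations can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`, which is an assumed tool here rather than something the text uses; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-k <= z <= k) for k = 1, 2, 3 standard deviations around the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, p in within.items():
    print(f"within {k} sd: {p:.1%}")
# within 1 sd: 68.3%
# within 2 sd: 95.4%
# within 3 sd: 99.7%
```

Because every normal distribution can be rescaled to the standard normal, the same percentages apply for any mean and standard deviation.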

2. Standard Normal Probability Distribution

A random variable that has a normal distribution with a mean of zero and a standard deviation of one is said to have a standard normal probability distribution. The letter z is commonly used to designate this particular normal random variable. Figure 6.7 is the graph of the standard normal distribution. It has the same general appearance as other normal distributions, but with the special properties of μ = 0 and σ = 1.

Because μ = 0 and σ = 1, the formula for the standard normal probability density function is a simpler version of equation (6.2).

f(z) = [1 / √(2π)] e^(–z² / 2)

As with other continuous random variables, probability calculations with any normal distribution are made by computing areas under the graph of the probability density function. Thus, to find the probability that a normal random variable is within any specific interval, we must compute the area under the normal curve over that interval.

For the standard normal distribution, areas under the normal curve have been computed and are available in tables that can be used to compute probabilities. Such a table appears inside the front cover of this text, and as part of Appendix B in the digital version. The table on the left-hand page contains areas, or cumulative probabilities, for z values less than or equal to the mean of zero. The table on the right-hand page contains areas, or cumulative probabilities, for z values greater than or equal to the mean of zero.

We need to compute three types of probabilities: (1) the probability that the standard normal random variable z will be less than or equal to a given value; (2) the probability that z will be between two given values; and (3) the probability that z will be greater than or equal to a given value. To see how the cumulative probability table for the standard normal distribution can be used to compute these three types of probabilities, let us consider some examples.

We start by showing how to compute the probability that z is less than or equal to 1.00; that is, P(z ≤ 1.00). Because z is a continuous random variable, P(z ≤ 1.00) and P(z < 1.00) are the same area. This cumulative probability is the area under the normal curve to the left of z = 1.00 in Figure 6.8.

Refer to the standard normal probability table. The cumulative probability corresponding to z = 1.00 is the table value located at the intersection of the row labeled 1.0 and the column labeled .00. First we find 1.0 in the left column of the table and then find .00 in the top row of the table. By looking in the body of the table, we find that the 1.0 row and the .00 column intersect at the value of .8413; thus, P(z ≤ 1.00) = .8413. The following excerpt from the probability table shows these steps.
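
The same cumulative probability can be obtained in software instead of from the printed table. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` (not something the text itself uses):

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Cumulative probability: area under the curve to the left of z = 1.00
p_le_1 = std_normal.cdf(1.00)
print(f"{p_le_1:.4f}")  # 0.8413
```

The `cdf` method plays the role of the table lookup: it returns the area to the left of the given z value.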

To illustrate the second type of probability calculation, we show how to compute the probability that z is in the interval between -.50 and 1.25; that is, P(-.50 < z < 1.25). Figure 6.9 shows this area, or probability.

Three steps are required to compute this probability. First, we find the area under the normal curve to the left of z = 1.25. Second, we find the area under the normal curve to the left of z = -.50. Finally, we subtract the area to the left of z = -.50 from the area to the left of z = 1.25 to find P(-.50 < z < 1.25).

To find the area under the normal curve to the left of z = 1.25, we first locate the 1.2 row in the standard normal probability table and then move across to the .05 column. Because the table value in the 1.2 row and the .05 column is .8944, P(z < 1.25) = .8944. Similarly, to find the area under the curve to the left of z = -.50, we use the left-hand page of the table to locate the table value in the -.5 row and the .00 column; with a table value of .3085, P(z < -.50) = .3085. Thus, P(-.50 < z < 1.25) = P(z < 1.25) – P(z < -.50) = .8944 – .3085 = .5859.
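
The three-step subtraction can be sketched in Python with the standard-library `statistics.NormalDist` (an assumed tool; the variable names are illustrative). Note that the unrounded result differs from the table-based answer in the fourth decimal place, because the table values are rounded before subtracting.

```python
from statistics import NormalDist

std_normal = NormalDist()

upper = std_normal.cdf(1.25)    # step 1: area to the left of z = 1.25
lower = std_normal.cdf(-0.50)   # step 2: area to the left of z = -.50
p_between = upper - lower       # step 3: subtract to get the area in between

print(f"{upper:.4f} {lower:.4f} {p_between:.4f}")  # 0.8944 0.3085 0.5858
```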

Let us consider another example of computing the probability that z is in the interval between two given values. Often it is of interest to compute the probability that a normal random variable assumes a value within a certain number of standard deviations of the mean. Suppose we want to compute the probability that the standard normal random variable is within one standard deviation of the mean; that is, P(-1.00 < z < 1.00). To compute this probability we must find the area under the curve between -1.00 and 1.00. Earlier we found that P(z < 1.00) = .8413. Referring again to the table inside the front cover of the book, we find that the area under the curve to the left of z = -1.00 is .1587, so P(z < -1.00) = .1587. Therefore, P(-1.00 < z < 1.00) = P(z < 1.00) – P(z < -1.00) = .8413 – .1587 = .6826. This probability is shown graphically in Figure 6.10.

To illustrate how to make the third type of probability computation, suppose we want to compute the probability of obtaining a z value of at least 1.58; that is, P(z ≥ 1.58). Because z is continuous, this is the same as P(z > 1.58).

The value in the z = 1.5 row and the .08 column of the cumulative normal table is .9429; thus, P(z < 1.58) = .9429. However, because the total area under the normal curve is 1, P(z > 1.58) = 1 – .9429 = .0571. This probability is shown in Figure 6.11.
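
An upper-tail probability of this kind can be sketched as one minus the cumulative probability. The snippet assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

p_left = std_normal.cdf(1.58)   # area to the left of z = 1.58
p_upper_tail = 1 - p_left       # total area is 1, so the rest lies to the right

print(f"{p_left:.4f} {p_upper_tail:.4f}")  # 0.9429 0.0571
```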

In the preceding illustrations, we showed how to compute probabilities given specified z values. In some situations, we are given a probability and are interested in working backward to find the corresponding z value. Suppose we want to find a z value such that the probability of obtaining a larger z value is .10. Figure 6.12 shows this situation graphically.

This problem is the inverse of those in the preceding examples. Previously, we specified the z value of interest and then found the corresponding probability, or area. In this example, we are given the probability, or area, and asked to find the corresponding z value. To do so, we use the standard normal probability table somewhat differently.

Recall that the standard normal probability table gives the area under the curve to the left of a particular z value. We have been given the information that the area in the upper tail of the curve is .10. Hence, the area under the curve to the left of the unknown z value must equal .9000. Scanning the body of the table, we find .8997 is the cumulative probability value closest to .9000. The section of the table providing this result follows.

Reading the z value from the left-most column and the top row of the table, we find that the corresponding z value is 1.28. Thus, an area of approximately .9000 (actually .8997) will be to the left of z = 1.28. In terms of the question originally asked, there is an approximately .10 probability of a z value larger than 1.28.
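
Working backward from a probability to a z value corresponds to the inverse of the cumulative distribution function. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` and its `inv_cdf` method (the variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()

upper_tail_area = 0.10
left_area = 1 - upper_tail_area          # area to the left of the unknown z
z_value = std_normal.inv_cdf(left_area)  # inverse lookup: probability -> z

print(f"{z_value:.2f}")  # 1.28
```

The unrounded value is slightly above 1.28, which is consistent with the table entry .8997 falling just short of .9000.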

The examples illustrate that the table of cumulative probabilities for the standard normal probability distribution can be used to find probabilities associated with values of the standard normal random variable z. Two types of questions can be asked. The first type of question specifies a value, or values, for z and asks us to use the table to determine the corresponding areas or probabilities. The second type of question provides an area, or probability, and asks us to use the table to determine the corresponding z value. Thus, we need to be flexible in using the standard normal probability table to answer the desired probability question. In most cases, sketching a graph of the standard normal probability distribution and shading the appropriate area will help to visualize the situation and aid in determining the correct answer.

3. Computing Probabilities for Any Normal Probability Distribution

The reason for discussing the standard normal distribution so extensively is that probabilities for all normal distributions can be computed using the standard normal distribution. That is, when we have a normal distribution with any mean μ and any standard deviation σ, we can answer probability questions about the distribution by first converting to the standard normal distribution. Then we can use the standard normal probability table and the appropriate z values to find the desired probabilities. The formula used to convert any normal random variable x with mean μ and standard deviation σ to the standard normal random variable z follows.

z = (x – μ)/σ     (6.3)

A value of x equal to its mean μ results in z = (μ – μ)/σ = 0. Thus, we see that a value of x equal to its mean μ corresponds to z = 0. Now suppose that x is one standard deviation above its mean; that is, x = μ + σ. Applying equation (6.3), we see that the corresponding z value is z = [(μ + σ) – μ]/σ = σ/σ = 1. Thus, an x value that is one standard deviation above its mean corresponds to z = 1. In other words, we can interpret z as the number of standard deviations that the normal random variable x is from its mean μ.

To see how this conversion enables us to compute probabilities for any normal distribution, suppose we have a normal distribution with μ = 10 and σ = 2. What is the probability that the random variable x is between 10 and 14? Using equation (6.3), we see that at x = 10, z = (x – μ)/σ = (10 – 10)/2 = 0 and that at x = 14, z = (14 – 10)/2 = 4/2 = 2. Thus, the answer to our question about the probability of x being between 10 and 14 is given by the equivalent probability that z is between 0 and 2 for the standard normal distribution. In other words, the probability that we are seeking is the probability that the random variable x is between its mean and two standard deviations above the mean. Using z = 2.00 and the standard normal probability table inside the front cover of the text, we see that P(z < 2) = .9772. Because P(z < 0) = .5000, we can compute P(.00 < z < 2.00) = P(z < 2) – P(z < 0) = .9772 – .5000 = .4772. Hence the probability that x is between 10 and 14 is .4772.
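
The conversion to z and the resulting probability can be sketched as follows. The helper `to_z` is a hypothetical name introduced for illustration, and `statistics.NormalDist` from Python's standard library is an assumed tool; the second calculation skips the conversion to show that both routes give the same area.

```python
from statistics import NormalDist


def to_z(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations x lies from the mean (hypothetical helper)."""
    return (x - mu) / sigma


mu, sigma = 10, 2
z_low, z_high = to_z(10, mu, sigma), to_z(14, mu, sigma)

# Route 1: convert to z, then use the standard normal distribution
std_normal = NormalDist()
p_via_z = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Route 2: work directly with the normal distribution of x
x_dist = NormalDist(mu=mu, sigma=sigma)
p_direct = x_dist.cdf(14) - x_dist.cdf(10)

print(z_low, z_high)                      # 0.0 2.0
print(f"{p_via_z:.4f} {p_direct:.4f}")    # 0.4772 0.4772
```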

4. Grear Tire Company Problem

We turn now to an application of the normal probability distribution. Suppose the Grear Tire Company developed a new steel-belted radial tire to be sold through a national chain of discount stores. Because the tire is a new product, Grear’s managers believe that the mileage guarantee offered with the tire will be an important factor in the acceptance of the product. Before finalizing the tire mileage guarantee policy, Grear’s managers want probability information about x = number of miles the tires will last.

From actual road tests with the tires, Grear’s engineering group estimated that the mean tire mileage is μ = 36,500 miles and that the standard deviation is σ = 5000. In addition, the data collected indicate that a normal distribution is a reasonable assumption. What percentage of the tires can be expected to last more than 40,000 miles? In other words, what is the probability that the tire mileage, x, will exceed 40,000? This question can be answered by finding the area of the darkly shaded region in Figure 6.13.

At x = 40,000, we have

z = (x – μ)/σ = (40,000 – 36,500)/5000 = 3500/5000 = .70

Refer now to the bottom of Figure 6.13. We see that a value of x = 40,000 on the Grear Tire normal distribution corresponds to a value of z = .70 on the standard normal distribution. Using the standard normal probability table, we see that the area under the standard normal curve to the left of z = .70 is .7580. Thus, 1.000 – .7580 = .2420 is the probability that z will exceed .70 and hence x will exceed 40,000. We can conclude that about 24.2% of the tires will exceed 40,000 in mileage.
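
The Grear calculation can be reproduced with a short sketch. It assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the mean and standard deviation are the values given in the text.

```python
from statistics import NormalDist

mu, sigma = 36_500, 5_000     # estimated tire mileage parameters from the text
threshold = 40_000

z = (threshold - mu) / sigma                 # convert 40,000 miles to a z value
p_exceed = 1 - NormalDist().cdf(z)           # upper-tail area beyond z

print(f"z = {z:.2f}, P(x > 40,000) = {p_exceed:.4f}")
# z = 0.70, P(x > 40,000) = 0.2420
```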

Let us now assume that Grear is considering a guarantee that will provide a discount on replacement tires if the original tires do not provide the guaranteed mileage. What should the guarantee mileage be if Grear wants no more than 10% of the tires to be eligible for the discount guarantee? This question is interpreted graphically in Figure 6.14.

According to Figure 6.14, the area under the curve to the left of the unknown guarantee mileage must be .10. So, we must first find the z value that cuts off an area of .10 in the left tail of a standard normal distribution. Using the standard normal probability table, we see that z = -1.28 cuts off an area of .10 in the lower tail. Hence, z = -1.28 is the value of the standard normal random variable corresponding to the desired mileage guarantee on the Grear Tire normal distribution. To find the value of x corresponding to z = -1.28, we have

z = (x – μ)/σ = -1.28
x – μ = -1.28σ
x = μ – 1.28σ

With μ = 36,500 and σ = 5000,

x = 36,500 – 1.28(5000) = 30,100

Thus, a guarantee of 30,100 miles will meet the requirement that approximately 10% of the tires will be eligible for the guarantee. Perhaps, with this information, the firm will set its tire mileage guarantee at 30,000 miles.
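
The guarantee mileage can be sketched in two ways: with the rounded table value z = -1.28, as in the text, and with an unrounded inverse lookup. The snippet assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 36_500, 5_000
eligible_share = 0.10          # at most 10% of tires should fall below the guarantee

# Table-based route: z = -1.28 cuts off roughly .10 in the lower tail
x_table = mu + (-1.28) * sigma

# Unrounded route: invert the cumulative distribution of x directly
x_exact = NormalDist(mu=mu, sigma=sigma).inv_cdf(eligible_share)

print(f"{x_table:,.0f} {x_exact:,.0f}")  # 30,100 30,092
```

The small gap between the two figures comes only from rounding z to two decimals; either supports a guarantee near 30,000 miles.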

Again, we see the important role that probability distributions play in providing decision-making information. Namely, once a probability distribution is established for a particular application, it can be used to obtain probability information about the problem. Probability does not make a decision recommendation directly, but it provides information that helps the decision maker better understand the risks and uncertainties associated with the problem. Ultimately, this information may assist the decision maker in reaching a good decision.

Source: Anderson, David R., Sweeney, Dennis J., and Williams, Thomas A. (2019), Statistics for Business & Economics, 14th edition, Cengage Learning.
