6. Process or Product Monitoring and Control
6.5. Tutorials

6.5.1. What do we mean by "Normal" data?

The Normal distribution model
"Normal" data are data that are drawn from a population that has a normal distribution. This distribution is inarguably the most important and the most frequently used distribution in both the theory and application of statistics. If X is a normal random variable, then the probability distribution of X is
    f(x) = [1/(sigma*SQRT(2*PI))]*EXP[(-1/2)*((x-mu)/sigma)**2],
    -infinity < x < infinity
Parameters of normal distribution
The parameters of the normal distribution are the mean mu and the standard deviation sigma (or the variance sigma**2). A special notation is employed to indicate that X is normally distributed with these parameters, namely
    X ~ N(mu, sigma) or X ~ N(mu, sigma**2).
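The density formula can be written as a short function. This is only a minimal sketch: the name normal_pdf and its default arguments are illustrative choices, not part of the handbook.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f(x) of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The density peaks at x = mu; for mu = 0, sigma = 1 the peak is 1/SQRT(2*PI).
peak = normal_pdf(0.0)
# Points equally far from mu on either side have the same density.
left, right = normal_pdf(-1.5), normal_pdf(1.5)
```

With the default parameters (mu = 0, sigma = 1) the peak value is about 0.3989.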
Shape is symmetric and unimodal
The shape of the normal distribution is symmetric and unimodal. It is called the bell-shaped or Gaussian distribution after its inventor, Gauss (although De Moivre also deserves credit).

The visual appearance is given below.

[Figure: sample plot of the normal distribution]

Property of probability distributions is that area under curve equals one
A property of a special class of non-negative functions, called probability distributions, is that the area under the curve equals unity. One finds the area under any portion of the curve by integrating the distribution between the specified limits. The area under the bell-shaped curve of the normal distribution can be shown to be equal to 1, and therefore the normal distribution is a probability distribution.
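The unit-area property can be checked numerically. The sketch below uses the trapezoidal rule over mu +/- 10 sigma as a stand-in for the infinite range. The function names, the values of mu and sigma, and the number of subintervals are all arbitrary illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_pdf(a, b, mu, sigma, n=20000):
    """Trapezoidal-rule approximation to the integral of the density from a to b."""
    h = (b - a) / n
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, n):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

mu, sigma = 10.0, 2.0   # arbitrary illustrative values
# The tails beyond 10 sigma contribute a negligible amount, so this is close to 1.
total_area = area_under_pdf(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)
```

The same helper gives the area under any portion of the curve by changing the limits a and b.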
Interpretation of sigma
There is a simple interpretation of sigma:

68.27% of the population falls between mu - 1 sigma and mu + 1 sigma
95.45% of the population falls between mu - 2 sigma and mu + 2 sigma
99.73% of the population falls between mu - 3 sigma and mu + 3 sigma
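These three percentages can be reproduced from the error function, since for a normal variate P{mu - k sigma < X < mu + k sigma} = erf(k/SQRT(2)) regardless of mu and sigma. A minimal sketch follows, in which the helper name coverage_within is an illustrative assumption.

```python
import math

def coverage_within(k):
    """Probability that a normal variate lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

# Percentages for k = 1, 2, 3, rounded to two decimals as in the list above.
coverage = {k: round(100 * coverage_within(k), 2) for k in (1, 2, 3)}
```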
The cumulative normal distribution
The cumulative normal distribution is defined as the probability that the normal variate is less than or equal to some value v, or
    P{X <= v} = F(v) = INTEGRAL[-infinity to v] [(1/(sigma*SQRT(2*PI)))*EXP((-1/2)*((x-mu)/sigma)**2) dx]
Unfortunately this integral cannot be evaluated in closed form and one has to resort to numerical methods. But even so, tables for all possible values of mu and sigma would be required. A change of variables rescues the situation. We let
    z = (x - mu)/sigma

Now the evaluation can be made independently of mu and sigma; that is,

    P{X <= v} = P{z <= (v-mu)/sigma} = PHI{(v-mu)/sigma}
where PHI(.) is the cumulative distribution function of the standard normal distribution (mu = 0, sigma = 1):
    PHI(z) = INTEGRAL[-infinity to z] [(1/SQRT(2*PI))*EXP(-x**2/2) dx]
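The change of variables can be illustrated in code. The sketch below evaluates PHI through the error function, one of several possible numerical routes, and compares the standardized result with a direct evaluation by Python's statistics.NormalDist. The function names and the values of mu, sigma and v are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def std_normal_cdf(z):
    """PHI(z) for the standard normal, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(v, mu, sigma):
    """P{X <= v} for a normal X, computed by standardizing: z = (v - mu)/sigma."""
    return std_normal_cdf((v - mu) / sigma)

mu, sigma, v = 50.0, 4.0, 56.0              # hypothetical values; here z = 1.5
p_standardized = normal_cdf(v, mu, sigma)   # PHI(1.5)
p_direct = NormalDist(mu, sigma).cdf(v)     # same probability without standardizing by hand
```

Both routes should agree to within floating-point rounding, which is the point of the substitution: a single function PHI serves every choice of mu and sigma.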
Tables for the cumulative standard normal distribution
Tables of the cumulative standard normal distribution are given in most statistics textbooks and in this handbook. A rich variety of approximations can be found in the literature on numerical methods.

For example, if mu = 0 and sigma = 1 then the area under the curve from mu - 1 sigma to mu + 1 sigma is the area from 0 - 1 to 0 + 1, which is 0.6827. Since most standard normal tables give the area to the left of the lookup value, they will have for z = 1 an area of 0.8413 and for z = -1 an area of 0.1587. By subtraction we obtain the area between -1 and +1 to be 0.8413 - 0.1587 = 0.6826. The small difference from 0.6827 arises because the table entries are themselves rounded to four decimal places.
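The table lookup can be mimicked by rounding computed values of the cumulative standard normal distribution to four decimals, as a printed table would. This is a sketch using Python's statistics.NormalDist, and the variable names are illustrative assumptions.

```python
from statistics import NormalDist

phi = NormalDist().cdf                  # standard normal: mu = 0, sigma = 1

upper = round(phi(1.0), 4)              # four-decimal table entry for z = 1
lower = round(phi(-1.0), 4)             # four-decimal table entry for z = -1
from_table = round(upper - lower, 4)    # area obtained by subtracting table entries
exact = round(phi(1.0) - phi(-1.0), 4)  # area rounded only after subtracting
```

Subtracting the rounded entries gives 0.6826, while rounding only at the end gives 0.6827.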
