Product and Process Comparisons
7.2. Comparisons based on data from one process
7.2.5. What intervals contain a fixed percentage of the population values?
|Definitions of order statistics and ranks||
For a series of measurements Y1, ...,
YN, denote the data arranged in increasing order
of magnitude by Y[1], ...,
Y[N]. These ordered data are called order
statistics. If Y[j] is the order statistic that
corresponds to the measurement Yi,
then the rank of Yi is j.
|Definition of percentiles||
Order statistics provide a way of estimating proportions of the
data that should fall above and below a given value, called a
percentile. The pth percentile, where p is expressed as a
proportion between 0 and 1, is a value Y(p) such that at most
(100p)% of the measurements are less than this value and at most
100(1-p)% are greater. The 50th percentile is called the median.
Percentiles split a set of ordered data into hundredths (deciles split ordered data into tenths). For example, 70% of the data should fall below the 70th percentile.
|Estimation of percentiles||
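Percentiles can be estimated from N measurements as follows:
for the pth percentile, set p(N+1) equal to
k + d, where k is an integer and d is a fraction
greater than or equal to 0 and less than 1.
(1) For 0 < k < N, Y(p) = Y[k] + d(Y[k+1] - Y[k])
(2) For k = 0, Y(p) = Y[1]
(3) For k = N, Y(p) = Y[N]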
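The estimation rule can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `percentile_n_plus_1` is hypothetical, and the sketch assumes linear interpolation between Y[k] and Y[k+1] when 0 < k < N, with the estimate set to the smallest or largest observation when k = 0 or k = N.

```python
import math


def percentile_n_plus_1(data, p):
    """Estimate Y(p) by setting p(N+1) = k + d (hypothetical helper name)."""
    if not data:
        raise ValueError("data must contain at least one measurement")
    if not 0 <= p <= 1:
        raise ValueError("p must be a proportion between 0 and 1")

    ordered = sorted(data)      # order statistics Y[1], ..., Y[N]
    n = len(ordered)

    position = p * (n + 1)
    k = math.floor(position)    # integer part
    d = position - k            # fractional part, 0 <= d < 1

    if k <= 0:                  # assumed rule: use the smallest observation
        return ordered[0]
    if k >= n:                  # assumed rule: use the largest observation
        return ordered[-1]

    # ordered[k - 1] is Y[k] because Python lists are 0-indexed.
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])
```

For an illustrative (made-up) sample `[1, 2, 3, 4]` and p = 0.5, p(N+1) = 2.5, so k = 2 and d = 0.5, and the function returns 2.5, halfway between Y[2] and Y[3].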
|Example and interpretation||
For the purpose of illustration, twelve measurements from a
gage study are shown
below. The measurements are resistivities of silicon wafers
measured in ohm.cm.

  i   Measurements   Order stats   Ranks
  1     95.1772        95.0610        9
  2     95.1567        95.0925        6
  3     95.1937        95.1065       10
  4     95.1959        95.1195       11
  5     95.1442        95.1442        5
  6     95.0610        95.1567        1
  7     95.1591        95.1591        7
  8     95.1195        95.1682        4
  9     95.1065        95.1772        3
 10     95.0925        95.1937        2
 11     95.1990        95.1959       12
 12     95.1682        95.1990        8

To find the 90th percentile, p(N+1) = 0.9(13) = 11.7, so k = 11 and d = 0.7. From condition (1) above (linear interpolation between Y[k] and Y[k+1]), Y(0.90) is estimated to be 95.1959 + 0.7(95.1990 - 95.1959) = 95.1981 ohm.cm. This percentile, although it is an estimate from a small sample of resistivity measurements, gives an indication of the percentile for a population of resistivity measurements.
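The order statistics, ranks, and 90th percentile estimate for the twelve measurements can be reproduced with a short Python sketch. The variable names are illustrative only, and the interpolation step assumes the rule Y(p) = Y[k] + d(Y[k+1] - Y[k]) for 0 < k < N.

```python
# Resistivity measurements (ohm.cm) in the order they were taken.
measurements = [
    95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
    95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682,
]

# Order statistics Y[1], ..., Y[N].
order_stats = sorted(measurements)

# Rank of each measurement: its 1-based position among the order statistics.
# All twelve values are distinct, so ties need no special handling here.
ranks = [order_stats.index(y) + 1 for y in measurements]

# 90th percentile: p(N+1) = 0.9(13) = 11.7, so k = 11 and d = 0.7.
n = len(measurements)
position = 0.9 * (n + 1)
k = int(position)
d = position - k
y_90 = order_stats[k - 1] + d * (order_stats[k] - order_stats[k - 1])

print(ranks)            # [9, 6, 10, 11, 5, 1, 7, 4, 3, 2, 12, 8]
print(round(y_90, 4))   # 95.1981
```

As a cross-check, Python's standard-library call `statistics.quantiles(measurements, n=10, method="exclusive")` places its cut points at multiples of (N+1)/10, so its ninth value should agree with this estimate to rounding.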
|Note that there are other ways of calculating percentiles in common use||
Some software packages (Excel, for example) set
1 + p(N-1) equal to k + d, then proceed
as above. The two methods give fairly similar results.
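A minimal sketch of the 1 + p(N-1) rule follows, applied to the twelve resistivity measurements from the example. The function name `percentile_n_minus_1` is hypothetical, and the sketch assumes that "proceed as above" means the same linear interpolation between Y[k] and Y[k+1]; it is not claimed to reproduce any particular package exactly.

```python
import math


def percentile_n_minus_1(data, p):
    """Estimate Y(p) by setting 1 + p(N-1) = k + d (hypothetical helper name)."""
    if not data:
        raise ValueError("data must contain at least one measurement")
    if not 0 <= p <= 1:
        raise ValueError("p must be a proportion between 0 and 1")

    ordered = sorted(data)          # order statistics Y[1], ..., Y[N]
    n = len(ordered)

    position = 1 + p * (n - 1)      # always lies between 1 and N
    k = math.floor(position)
    d = position - k

    if k >= n:                      # p = 1 lands exactly on Y[N]
        return ordered[-1]
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])


measurements = [
    95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
    95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682,
]

# 1 + 0.9(11) = 10.9, so k = 10 and d = 0.9.
print(round(percentile_n_minus_1(measurements, 0.9), 4))   # 95.1957
```

For these data the estimate is 95.1957 ohm.cm, about 0.0024 ohm.cm below the 95.1981 ohm.cm obtained from the p(N+1) rule.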
A third way of calculating percentiles (given in some elementary textbooks) starts by calculating pN. If pN is not an integer, round it up to the next integer k and use Y[k] as the percentile estimate. If pN is an integer k, use 0.5(Y[k] + Y[k+1]).
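The third rule can be sketched as follows. The function name `percentile_round_up` is hypothetical. Two choices in the sketch are assumptions rather than part of the rule as stated: p is restricted to values strictly between 0 and 1 so that Y[k] and Y[k+1] always exist, and p is converted to an exact fraction so that the test "pN is an integer" is not disturbed by floating-point rounding.

```python
import math
from fractions import Fraction


def percentile_round_up(data, p):
    """Estimate Y(p) from pN, rounding up (hypothetical helper name)."""
    if not data:
        raise ValueError("data must contain at least one measurement")

    # Parse p from its decimal string so that, e.g., 0.5 * 12 is exactly 6.
    p_exact = Fraction(str(p))
    if not 0 < p_exact < 1:
        raise ValueError("p must be strictly between 0 and 1")

    ordered = sorted(data)              # order statistics Y[1], ..., Y[N]
    pn = p_exact * len(ordered)

    if pn.denominator == 1:             # pN is an integer k
        k = pn.numerator
        return 0.5 * (ordered[k - 1] + ordered[k])   # 0.5(Y[k] + Y[k+1])

    k = math.ceil(pn)                   # round up to the next integer
    return ordered[k - 1]               # Y[k]


measurements = [
    95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
    95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682,
]

# pN = 0.9(12) = 10.8 is not an integer, so k = 11 and the estimate is Y[11].
print(percentile_round_up(measurements, 0.9))             # 95.1959

# pN = 0.5(12) = 6 is an integer, so the estimate is 0.5(Y[6] + Y[7]).
print(round(percentile_round_up(measurements, 0.5), 4))   # 95.1579
```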
|Definition of Tolerance Interval||
An interval covering population percentiles can be interpreted as "covering a proportion p of the population with a level of confidence, say, 90%." This is known as a tolerance interval.