|
7. Product and Process Comparisons
7.2. Comparisons based on data from one process
7.2.5. What intervals contain a fixed percentage of the population values?
|
|||
| Definitions of order statistics and ranks |
For a series of measurements Y1, ..., YN, denote the data
arranged in increasing order of magnitude by Y[1], ..., Y[N].
These ordered data are called order statistics. If Y[j] is the
order statistic that corresponds to the measurement Yi, then the
rank of Yi is j; i.e., Y[j] = Yi implies that the rank of Yi is j.
|
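The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function name `order_statistics_and_ranks` and the three sample values are hypothetical, and the sketch assumes the measurements are distinct (ties would be ranked in order of appearance).

```python
def order_statistics_and_ranks(y):
    """Return (order statistics, ranks) for a list of measurements.

    Assumption: the values are distinct. With ties, Python's stable sort
    ranks equal values in their order of appearance.
    """
    # Indices of y arranged so that the measurements are increasing.
    order = sorted(range(len(y)), key=lambda i: y[i])
    order_stats = [y[i] for i in order]
    # ranks[i] is the 1-based j such that Y[j] corresponds to Y_i.
    ranks = [0] * len(y)
    for j, i in enumerate(order, start=1):
        ranks[i] = j
    return order_stats, ranks


# Hypothetical measurements, chosen only to illustrate the definitions.
order_stats, ranks = order_statistics_and_ranks([3.2, 1.5, 2.7])
print(order_stats)  # [1.5, 2.7, 3.2]
print(ranks)        # [3, 1, 2]
```

Here the first measurement, 3.2, is the largest of the three, so it corresponds to Y[3] and has rank 3.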
||
| Definition of percentiles |
Order statistics provide a way of estimating the proportions of the
data that should fall above and below a given value, called a
percentile. For a proportion p between 0 and 1, the (100p)th
percentile is a value, Y(p), such that at most (100p)% of the
measurements are less than this value and at most 100(1 - p)% are
greater. The 50th percentile is called the median.
Percentiles split a set of ordered data into hundredths (deciles split ordered data into tenths). For example, 70% of the data should fall below the 70th percentile. |
||
| Estimation of percentiles |
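Percentiles can be estimated from N measurements as follows:
for the percentile Y(p), set p(N+1) equal to k + d, where k is
an integer and d is a fraction greater than or equal to 0 and
less than 1.
(1) For 0 < k < N, Y(p) = Y[k] + d(Y[k+1] - Y[k]).
(2) For k = 0, Y(p) = Y[1].
(3) For k = N, Y(p) = Y[N].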
|
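The estimation rule can be sketched as a short Python function. This is an illustrative sketch under stated assumptions rather than a reference implementation: the name `percentile_estimate` is hypothetical, p is taken to be a proportion strictly between 0 and 1, and the sketch assumes linear interpolation between adjacent order statistics when 0 < k < N, with the smallest or largest order statistic used when k falls outside that range.

```python
import math


def percentile_estimate(y, p):
    """Estimate Y(p), 0 < p < 1, by writing p(N+1) as k + d."""
    order_stats = sorted(y)
    n = len(order_stats)
    position = p * (n + 1)
    k = math.floor(position)   # integer part
    d = position - k           # fractional part, 0 <= d < 1
    if k <= 0:
        return order_stats[0]      # below the first order statistic
    if k >= n:
        return order_stats[-1]     # at or beyond the last order statistic
    # Python lists are 0-based, so Y[k] is order_stats[k - 1].
    return order_stats[k - 1] + d * (order_stats[k] - order_stats[k - 1])


# Hypothetical data: p(N+1) = 0.5 * 5 = 2.5, so k = 2 and d = 0.5.
print(percentile_estimate([10, 20, 30, 40], 0.5))  # 25.0
```

Because the interpolation is continuous in the position k + d, small floating-point errors in p(N+1) change the result only negligibly.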
||
| Example and interpretation |
For the purpose of illustration, twelve measurements from a
gage study are shown
below. The measurements are resistivities of silicon wafers
measured in ohm.cm.
 i   Measurements   Order stats   Ranks
 1     95.1772        95.0610        9
 2     95.1567        95.0925        6
 3     95.1937        95.1065       10
 4     95.1959        95.1195       11
 5     95.1442        95.1442        5
 6     95.0610        95.1567        1
 7     95.1591        95.1591        7
 8     95.1195        95.1682        4
 9     95.1065        95.1772        3
10     95.0925        95.1937        2
11     95.1990        95.1959       12
12     95.1682        95.1990        8
To find the 90th percentile, p(N+1) = 0.9(13) = 11.7, so
k = 11 and d = 0.7. From condition (1) above,
Y(0.90) = Y[11] + 0.7(Y[12] - Y[11]) = 95.1959 + 0.7(95.1990 - 95.1959),
so Y(0.90) is estimated to be 95.1981 ohm.cm. This percentile,
although it is an estimate from a small sample of resistivity
measurements, gives an indication of the percentile for a
population of resistivity measurements.
|
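The worked example can be checked with a few lines of Python. The sketch below uses the twelve resistivity measurements from the table; the variable names are illustrative, and the rank lookup relies on the twelve values being distinct.

```python
# Resistivity measurements (ohm.cm) in the order they were taken.
measurements = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
                95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682]

order_stats = sorted(measurements)
# Rank of each measurement: its 1-based position among the order statistics.
ranks = [order_stats.index(value) + 1 for value in measurements]

n = len(measurements)
position = 0.9 * (n + 1)   # p(N+1) = 11.7
k = int(position)          # integer part
d = position - k           # fractional part
# Lists are 0-based, so Y[k] is order_stats[k - 1].
estimate = order_stats[k - 1] + d * (order_stats[k] - order_stats[k - 1])

print(ranks)               # [9, 6, 10, 11, 5, 1, 7, 4, 3, 2, 12, 8]
print(k, round(d, 4))      # 11 0.7
print(round(estimate, 4))  # 95.1981
```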
||
| Note that there are other ways of calculating percentiles in common use |
Some software packages (Excel, for example) set
1 + p(N-1) equal to k + d, then proceed
as above. The two methods give fairly similar results.
A third way of calculating percentiles (given in some elementary textbooks) starts by calculating pN. If pN is not an integer, round it up to the next higher integer k and use Y[k] as the percentile estimate. If pN is an integer k, use 0.5(Y[k] + Y[k+1]). |
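To see how close the three methods are in practice, they can be compared on the twelve resistivity measurements from the example. The following is a hedged sketch: all function names are hypothetical, p is assumed to lie strictly between 0 and 1, the interpolation step assumes linear interpolation between adjacent order statistics, and the tolerance `tol` used to decide whether pN is an integer is an arbitrary choice made here to guard against floating-point error.

```python
import math


def interpolate(order_stats, position):
    """Interpolate at a 1-based position k + d, clamped to the data range."""
    n = len(order_stats)
    k = math.floor(position)
    d = position - k
    if k <= 0:
        return order_stats[0]
    if k >= n:
        return order_stats[-1]
    return order_stats[k - 1] + d * (order_stats[k] - order_stats[k - 1])


def percentile_n_plus_1(y, p):
    """First method: set p(N+1) equal to k + d."""
    s = sorted(y)
    return interpolate(s, p * (len(s) + 1))


def percentile_n_minus_1(y, p):
    """Second method: set 1 + p(N-1) equal to k + d."""
    s = sorted(y)
    return interpolate(s, 1 + p * (len(s) - 1))


def percentile_textbook(y, p, tol=1e-9):
    """Third method: based on pN, with no interpolation unless pN is an integer."""
    s = sorted(y)
    n = len(s)
    pn = p * n
    nearest = round(pn)
    if abs(pn - nearest) < tol and 0 < nearest < n:
        # pN is an integer k: average Y[k] and Y[k+1].
        return 0.5 * (s[nearest - 1] + s[nearest])
    # Otherwise round pN up to the next integer k and use Y[k].
    k = math.ceil(pn)
    return s[k - 1]


resistivities = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
                 95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682]

print(round(percentile_n_plus_1(resistivities, 0.9), 4))   # 95.1981
print(round(percentile_n_minus_1(resistivities, 0.9), 4))  # 95.1957
print(round(percentile_textbook(resistivities, 0.9), 4))   # 95.1959
```

For these data the three estimates of the 90th percentile differ only in the third decimal place, which is consistent with the statement that the methods give fairly similar results.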
||
| Definition of Tolerance Interval | An interval covering population percentiles can be interpreted as "covering a proportion p of the population with a level of confidence, say, 90%." This is known as a tolerance interval. | ||