7. Product and Process Comparisons
7.2. Comparisons based on data from one process
7.2.5. What intervals contain a fixed percentage of the population values?


Definitions of order statistics and ranks 
For a series of measurements Y_{1}, ..., Y_{N}, denote the data
ordered in increasing order of magnitude by Y_{[1]}, ..., Y_{[N]}.
These ordered data are called order statistics. If Y_{[j]} is the
order statistic that corresponds to the measurement Y_{i}, then the
rank for Y_{i} is j; i.e., Y_{[j]} = Y_{i}.
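The relationship between measurements, order statistics, and ranks can be sketched in a few lines of Python. This is only an illustration: the function name `ranks` and the three sample values are hypothetical, and the sketch assumes there are no tied measurements.

```python
def ranks(values):
    """Return 1-based ranks: entry i is the j for which Y_[j] is Y_i.

    Assumes no ties among the measurements (an assumption of this sketch).
    """
    # Indices of the measurements, sorted by increasing measurement value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for j, i in enumerate(order, start=1):
        result[i] = j  # measurement i occupies position j in the ordered data
    return result


sample = [3.2, 1.5, 2.7]   # hypothetical measurements Y_1, Y_2, Y_3
order_stats = sorted(sample)

print(order_stats)    # [1.5, 2.7, 3.2]
print(ranks(sample))  # [3, 1, 2]
```

Here Y_{1} = 3.2 is the largest value, so it is Y_{[3]} and has rank 3.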


Definition of percentiles 
Order statistics provide a way of estimating proportions of the
data that should fall above and below a given value, called a
percentile. The pth percentile (with p expressed as a proportion
between 0 and 1) is a value, Y_{(p)}, such that at most (100p)% of the
measurements are less than this value and at most 100(1 - p)% are
greater. The 50th percentile is called the median.
Percentiles split a set of ordered data into hundredths. (Deciles split ordered data into tenths.) For example, 70% of the data should fall below the 70th percentile.

Estimation of percentiles 
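Percentiles can be estimated from N measurements as follows:
for the pth percentile, set p(N+1) equal to k + d, where k is an
integer and d is a fraction greater than or equal to 0 and less
than 1. Then:

(1) For 0 < k < N, Y_{(p)} = Y_{[k]} + d(Y_{[k+1]} - Y_{[k]}).
(2) For k = 0, Y_{(p)} = Y_{[1]}.
(3) For k = N, Y_{(p)} = Y_{[N]}.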
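A minimal Python sketch of this estimate follows. It assumes linear interpolation between Y_{[k]} and Y_{[k+1]} when 0 < k < N, and falls back to the smallest or largest order statistic when k = 0 or k = N. The function name `estimate_percentile` and the four sample values are illustrative, not part of the handbook.

```python
import math


def estimate_percentile(data, p):
    """Estimate the pth percentile (p as a proportion, 0 < p < 1).

    Sets p(N+1) = k + d with k an integer and 0 <= d < 1, then
    interpolates between the order statistics Y_[k] and Y_[k+1].
    """
    y = sorted(data)            # order statistics; y[0] is Y_[1]
    n = len(y)
    pos = p * (n + 1)
    k = math.floor(pos)         # integer part
    d = pos - k                 # fractional part
    if k <= 0:                  # below the first order statistic
        return y[0]
    if k >= n:                  # at or beyond the last order statistic
        return y[-1]
    # y[k - 1] is Y_[k] and y[k] is Y_[k+1] in 1-based notation.
    return y[k - 1] + d * (y[k] - y[k - 1])


values = [10.0, 20.0, 30.0, 40.0]        # illustrative data only
print(estimate_percentile(values, 0.5))  # 25.0
```

With N = 4 and p = 0.5, p(N+1) = 2.5, so k = 2 and d = 0.5, and the estimate lies halfway between the second and third ordered values.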


Example and interpretation 
For the purpose of illustration, twelve measurements from a
gage study are shown below. The measurements are resistivities
of silicon wafers measured in ohm·cm.

 i   Measurements   Order stats   Ranks
 1     95.1772        95.0610        9
 2     95.1567        95.0925        6
 3     95.1937        95.1065       10
 4     95.1959        95.1195       11
 5     95.1442        95.1442        5
 6     95.0610        95.1567        1
 7     95.1591        95.1591        7
 8     95.1195        95.1682        4
 9     95.1065        95.1772        3
10     95.0925        95.1937        2
11     95.1990        95.1959       12
12     95.1682        95.1990        8

To find the 90th percentile, p(N+1) = 0.9(13) = 11.7; k = 11, and d = 0.7. From condition (1) above, Y_{(0.90)} is estimated as Y_{[11]} + 0.7(Y_{[12]} - Y_{[11]}) = 95.1959 + 0.7(95.1990 - 95.1959) = 95.1981 ohm·cm. This percentile, although it is an estimate from a small sample of resistivity measurements, gives an indication of the percentile for a population of resistivity measurements.

Note that there are other ways of calculating percentiles in common use 
Some software packages (EXCEL, for example) set
1 + p(N - 1) equal to k + d, then proceed
as above. The two methods give fairly similar results.
A third way of calculating percentiles (given in some elementary textbooks) starts by calculating pN. If that is not an integer, round up to the next integer k and use Y_{[k]} as the percentile estimate. If pN is an integer k, use 0.5(Y_{[k]} + Y_{[k+1]}).
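The two alternative rules can be sketched in Python and applied to the twelve resistivity measurements from the example, for comparison with the 95.1981 ohm·cm obtained from the p(N+1) rule. The function names are hypothetical, and the handling of pN = 0 or pN = N in the textbook rule (falling back to the end values) is an assumption of this sketch, since the text does not cover those cases.

```python
import math

# Resistivity measurements (ohm·cm) from the example in this section.
measurements = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
                95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682]


def percentile_spreadsheet_rule(data, p):
    """Set 1 + p(N - 1) = k + d, then interpolate between Y_[k] and Y_[k+1]."""
    y = sorted(data)
    n = len(y)
    pos = 1 + p * (n - 1)
    k = math.floor(pos)
    d = pos - k
    if k >= n:                       # p = 1 lands on the largest value
        return y[-1]
    return y[k - 1] + d * (y[k] - y[k - 1])


def percentile_textbook_rule(data, p, tol=1e-9):
    """Use pN: round up if not an integer, else average Y_[k] and Y_[k+1]."""
    y = sorted(data)
    n = len(y)
    pn = p * n
    nearest = round(pn)
    if abs(pn - nearest) < tol:      # pN is (numerically) an integer k
        k = int(nearest)
        if k <= 0:                   # assumption: no lower neighbour, use Y_[1]
            return y[0]
        if k >= n:                   # assumption: no upper neighbour, use Y_[N]
            return y[-1]
        return 0.5 * (y[k - 1] + y[k])
    k = math.ceil(pn)                # round up to the next integer
    return y[k - 1]


print(f"{percentile_spreadsheet_rule(measurements, 0.90):.4f}")  # 95.1957
print(f"{percentile_textbook_rule(measurements, 0.90):.4f}")     # 95.1959
```

For the 90th percentile, the spreadsheet rule gives 1 + 0.9(11) = 10.9, so it interpolates between Y_{[10]} and Y_{[11]}; the textbook rule gives pN = 10.8, which rounds up to k = 11. All three estimates agree to within about 0.003 ohm·cm on this sample.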

Definition of tolerance interval
An interval covering population percentiles can be interpreted as "covering a proportion p of the population with a level of confidence, say, 90%." This is known as a tolerance interval.