7. Product and Process Comparisons
7.2. Comparisons based on data from one process
7.2.6. What intervals contain a fixed percentage of the population values?
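Definitions of order statistics and ranks
For a series of measurements $Y_1, \ldots, Y_n$, denote the data ordered in increasing order of magnitude by $Y_{[1]}, \ldots, Y_{[n]}$. These ordered data are called order statistics. If $Y_{[j]}$ is the order statistic that corresponds to the measurement $Y_i$, then the rank of $Y_i$ is $j$; that is, $r_i = j$.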
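Definition of percentiles
Order statistics provide a way of estimating the proportions of the data that should fall above and below a given value, called a percentile. The $p$-th percentile, with $p$ expressed as a fraction between 0 and 1, is a value $Y(p)$ such that at most $100p$ % of the measurements are less than this value and at most $100(1-p)$ % are greater. Percentiles split a set of ordered data into hundredths. (Deciles split ordered data into tenths.) For example, 70 % of the data should fall below the 70th percentile. Given $n$ points, the percentile corresponding to the $i$-th ordered point is $100\,i/(n+1)$ under the method used in this section.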
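Estimation of percentiles
Percentiles can be estimated from $n$ measurements as follows. For the $p$-th percentile, set $p(n+1)$ equal to $k + d$, where $k$ is an integer and $d$ is a fraction with $0 \le d < 1$. Writing $Y_{[k]}$ for the $k$-th smallest measurement (the $k$-th order statistic), the estimate $Y(p)$ is:
1. For $0 < k < n$, $Y(p) = Y_{[k]} + d\,(Y_{[k+1]} - Y_{[k]})$
2. For $k = 0$, $Y(p) = Y_{[1]}$
3. For $k = n$, $Y(p) = Y_{[n]}$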
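The estimation rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the rule is the one the section later identifies as Hyndman and Fan's R6: write $p(n+1) = k + d$ and interpolate between the $k$-th and $(k+1)$-th order statistics. The function name `percentile_r6` and the toy data are hypothetical, chosen only for this example.

```python
import math


def percentile_r6(data, p):
    """Estimate the p-th percentile (p as a fraction, 0 < p < 1).

    Assumed rule: p(n+1) = k + d with integer k and 0 <= d < 1.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    y = sorted(data)            # order statistics Y[1] <= ... <= Y[n]
    n = len(y)
    if n == 0:
        raise ValueError("data must not be empty")

    h = p * (n + 1)
    k = math.floor(h)           # integer part
    d = h - k                   # fractional part

    if k == 0:                  # below the first order statistic
        return y[0]
    if k >= n:                  # at or beyond the last order statistic
        return y[-1]
    # y is 0-based, so Y[k] is y[k - 1] and Y[k+1] is y[k]
    return y[k - 1] + d * (y[k] - y[k - 1])


print(percentile_r6([10, 20, 30, 40], 0.5))   # 25.0
```

With four points and p = 0.5, the position is 0.5 × 5 = 2.5, so the estimate lies halfway between the second and third ordered values. Very small or very large p fall back to the smallest or largest observation rather than extrapolating.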
Example and interpretation
For the purpose of illustration, twelve measurements from a gage study are shown below. The measurements are resistivities of silicon wafers measured in ohm.cm.

 i   Measurements   Order stats   Ranks
 1     95.1772        95.0610        9
 2     95.1567        95.0925        6
 3     95.1937        95.1065       10
 4     95.1959        95.1195       11
 5     95.1442        95.1442        5
 6     95.0610        95.1567        1
 7     95.1591        95.1591        7
 8     95.1195        95.1682        4
 9     95.1065        95.1772        3
10     95.0925        95.1937        2
11     95.1990        95.1959       12
12     95.1682        95.1990        8

To find the 90th percentile, set $p(n+1) = 0.9(13) = 11.7$, so that $k = 11$ and $d = 0.7$. Interpolating between the 11th and 12th order statistics gives $Y(0.90) = 95.1959 + 0.7\,(95.1990 - 95.1959) \approx 95.1981$ ohm.cm.
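The order statistics and ranks in the table can be reproduced with the Python standard library. The sketch below is illustrative only; it also cross-checks the 90th percentile with `statistics.quantiles`, whose `"exclusive"` method places the $i$-th of $n$ sorted points at $i/(n+1)$, which is assumed here to match the method used in this section.

```python
import statistics

# Resistivities (ohm.cm) in the order they were measured
measurements = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
                95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682]

# Order statistics: the data sorted in increasing order
order_stats = sorted(measurements)

# Rank of each measurement: its 1-based position among the order
# statistics (this simple lookup assumes no tied values, as is the case here)
ranks = [order_stats.index(y) + 1 for y in measurements]

# n=10 returns the nine decile cut points; index 8 is the 90th percentile
deciles = statistics.quantiles(measurements, n=10, method="exclusive")
p90 = deciles[8]

print(ranks)           # [9, 6, 10, 11, 5, 1, 7, 4, 3, 2, 12, 8]
print(round(p90, 4))   # 95.1981
```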
Note that there are other ways of calculating percentiles in common use
Hyndman and Fan (1996) in an
American Statistician article evaluated nine different methods (we
will refer to these as R1 through R9) for computing percentiles relative
to six desirable properties. Their goal was to advocate a "standard"
definition for percentiles that would be implemented in statistical
software. Although this has not in fact happened, the article does
provide a useful summary and evaluation of various methods for
computing percentiles. Most statistical and spreadsheet software
use one of the methods described in Hyndman and Fan.
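The method described above, which sets $p(n+1)$ equal to $k + d$, corresponds to method R6 of Hyndman and Fan. This is the default method used by Dataplot.
The method advocated by Hyndman and Fan is R8. For the R8 method, set $p(n + 1/3) + 1/3$ equal to $k + d$, where $k$ is the integer part and $d$ the fractional part, and interpolate between the $k$-th and $(k+1)$-th order statistics exactly as in R6.
Some software packages set $1 + p(n - 1)$ equal to $k + d$ instead; this is method R7 of Hyndman and Fan.
The R6, R7, and R8 methods give fairly similar, but not exactly the same, results; the differences are most noticeable for small samples. For most purposes, any of these three methods should be acceptable.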
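To see how close the three methods are in practice, the sketch below applies them to the twelve resistivity measurements from the example. It assumes the standard Hyndman and Fan position rules: $p(n+1)$ for R6, $1 + p(n-1)$ for R7, and $p(n+1/3) + 1/3$ for R8. The names `POSITION_RULES` and `percentile` are hypothetical helpers for this illustration, not part of any package.

```python
import math

# Position k + d as a function of p and n for each method (assumed rules)
POSITION_RULES = {
    "R6": lambda p, n: p * (n + 1),
    "R7": lambda p, n: 1 + p * (n - 1),
    "R8": lambda p, n: p * (n + 1 / 3) + 1 / 3,
}


def percentile(data, p, method="R6"):
    """Interpolated percentile estimate for 0 < p < 1."""
    y = sorted(data)
    n = len(y)
    h = POSITION_RULES[method](p, n)
    # Clamp to [1, n] so the estimate never leaves the range of the data
    h = min(max(h, 1.0), float(n))
    k = math.floor(h)
    d = h - k
    if k >= n:
        return y[-1]
    # y is 0-based: Y[k] is y[k - 1], Y[k+1] is y[k]
    return y[k - 1] + d * (y[k] - y[k - 1])


measurements = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
                95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682]

for name in ("R6", "R7", "R8"):
    print(name, round(percentile(measurements, 0.90, name), 4))
# R6 95.1981
# R7 95.1957
# R8 95.1972
```

For this sample the three estimates of the 90th percentile agree to within about 0.0025 ohm.cm, and R8 falls between R6 and R7.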
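Another method of calculating percentiles (given in some elementary textbooks) starts by calculating $pn$. If $pn$ is not an integer, round it up to the next integer $k$ and use the $k$-th order statistic $Y_{[k]}$ as the percentile estimate. If $pn$ is an integer $k$, use $0.5\,(Y_{[k]} + Y_{[k+1]})$.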
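A sketch of a textbook-style calculation is given below. It assumes the rule in question is the common one: compute $pn$; if it is not an integer, round up to $k$ and take the $k$-th ordered value; if it is an integer $k$, average the $k$-th and $(k+1)$-th ordered values. The function name `textbook_percentile` is hypothetical, and the small tolerance is only there to guard against floating-point noise when testing whether $pn$ is an integer.

```python
import math


def textbook_percentile(data, p):
    """Percentile without interpolation, for 0 < p < 1 (assumed rule)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    y = sorted(data)
    n = len(y)
    h = p * n
    nearest = round(h)
    if math.isclose(h, nearest, abs_tol=1e-9):
        k = nearest
        # pn is an integer k: average Y[k] and Y[k+1] (0-based y[k-1], y[k])
        return 0.5 * (y[k - 1] + y[k])
    k = math.ceil(h)
    # pn is not an integer: use Y[k] with k rounded up
    return y[k - 1]


measurements = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.0610,
                95.1591, 95.1195, 95.1065, 95.0925, 95.1990, 95.1682]

print(textbook_percentile(measurements, 0.90))   # 95.1959
```

Here $pn = 10.8$ is rounded up to 11, so the estimate is simply the 11th ordered measurement, slightly below the interpolated value obtained from $p(n+1)$.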
Definition of Tolerance Interval
An interval covering population percentiles can be interpreted as "covering a proportion $p$ of the population with a level of confidence, say, 90 %." This is known as a tolerance interval.