Name: KURTOSIS OUTLIER TEST
The test statistic is the kurtosis coefficient

\[ g_2 = \frac{n (n+1) \sum_{i=1}^{n}{(x_{i} - \bar{x})^4}} {(n-1) (n-2) (n-3) s^{4}} - \frac{3 (n-1)^2}{(n-2) (n-3)} \]

where n, \( \bar{x} \) and s denote the sample size, the sample mean and the sample standard deviation, respectively. Note that this definition differs from the one used by Dataplot's EXCESS KURTOSIS command.

The critical values are obtained via simulation. The ASTM standard provides tabulated values for n = 3 to 50 and \( \alpha \) levels of 0.10, 0.05 and 0.01. Linear interpolation is used for values of n not given in the table. Alternatively, you can perform a dynamic simulation to obtain the critical values. To specify the method used to compute the critical values, enter one of the following commands (the default is ASTM):

    SET KURTOSIS OUTLIER TEST CRITICAL VALUES ASTM
    SET KURTOSIS OUTLIER TEST CRITICAL VALUES SIMULATION

If n > 50, the simulation method will be used regardless of this setting.
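As a cross-check outside Dataplot, the formula for g_2 can be sketched in a few lines of Python. This is only an illustration of the definition given above, not Dataplot's implementation; the function name `kurtosis_outlier_statistic` is hypothetical. The data are the ASTM E-178 values used in the sample program on this page, so the result should agree with the reported statistic of about 2.529.

```python
def kurtosis_outlier_statistic(x):
    """Return g2, the kurtosis coefficient defined on this page."""
    n = len(x)
    if n < 4:
        # The (n - 3) factor in the denominator makes g2 undefined for n <= 3.
        raise ValueError("g2 requires at least 4 observations")
    mean = sum(x) / n
    # s^2 is the sample variance with divisor n - 1, so s^4 = (s^2)^2.
    s2 = sum((v - mean) ** 2 for v in x) / (n - 1)
    if s2 == 0:
        raise ValueError("g2 is undefined when all observations are equal")
    fourth = sum((v - mean) ** 4 for v in x)
    first_term = n * (n + 1) * fourth / ((n - 1) * (n - 2) * (n - 3) * s2 ** 2)
    second_term = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first_term - second_term


# Data from the sample program on this page (ASTM E-178 document).
y = [-1.40, -0.44, -0.30, -0.24, -0.22, -0.13, -0.05, 0.06,
     0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01]

g2 = kurtosis_outlier_statistic(y)
print(f"g2 = {g2:.3f}")
```

The statistic is unchanged by shifting or rescaling the data, which is why critical values can be tabulated by sample size alone.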
    KURTOSIS OUTLIER TEST <y>    <SUBSET/EXCEPT/FOR qualification>

where <y> is the response variable being tested;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    MULTIPLE KURTOSIS OUTLIER TEST <y1> ... <yk>    <SUBSET/EXCEPT/FOR qualification>

where <y1> ... <yk> is a list of up to k response variables;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax performs the kurtosis outlier test on <y1>, then on <y2>, and so on. Up to 30 response variables can be specified.

Note that the syntax [...] is supported. This is equivalent to [...]

    REPLICATED KURTOSIS OUTLIER TEST <y> <x1> ... <xk>    <SUBSET/EXCEPT/FOR qualification>

where <y> is the response variable;
<x1> ... <xk> is a list of up to k group-id variables;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax performs a cross-tabulation of <x1> ... <xk> and performs a kurtosis outlier test for each unique combination of cross-tabulated values. For example, if X1 has 3 levels and X2 has 2 levels, a total of 6 kurtosis outlier tests will be performed. Up to six group-id variables can be specified.

Note that the syntax [...] is supported. This is equivalent to [...]
    MULTIPLE KURTOSIS OUTLIER TEST Y1 Y2 Y3
    REPLICATED KURTOSIS OUTLIER TEST Y X1 X2
    KURTOSIS OUTLIER TEST Y1 SUBSET TAG > 2
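The cross-tabulation performed by the REPLICATED form can be illustrated outside Dataplot with a short Python sketch. It groups the response by each unique combination of group-id values and computes g_2 within each cell. The helper names `g2` and `replicated` are hypothetical, the data are randomly generated purely for illustration, and nothing here is taken from Dataplot's own code.

```python
import random
from collections import defaultdict


def g2(x):
    """Kurtosis coefficient g2 as defined on this page (needs n >= 4)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / (n - 1)
    fourth = sum((v - mean) ** 4 for v in x)
    return (n * (n + 1) * fourth / ((n - 1) * (n - 2) * (n - 3) * s2 ** 2)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def replicated(y, *group_ids, stat=g2):
    """Apply stat to y within each unique combination of group-id values."""
    cells = defaultdict(list)
    for value, *levels in zip(y, *group_ids):
        cells[tuple(levels)].append(value)
    return {cell: stat(values) for cell, values in sorted(cells.items())}


# Hypothetical data: X1 has 3 levels, X2 has 2 levels, 5 observations per cell.
rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(30)]
x1 = [i % 3 + 1 for i in range(30)]
x2 = [(i // 3) % 2 + 1 for i in range(30)]

results = replicated(y, x1, x2)
for cell, value in results.items():
    print(cell, round(value, 3))
```

With 3 levels of X1 and 2 levels of X2 the dictionary has 6 entries, one per cell, matching the count given in the syntax description.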
The STATCDF and PVALUE are only saved when the simulation method is used to obtain critical values. If the ASTM method is used to obtain critical values, the CUTOFF80 and CUTOF975 values are not saved. If the MULTIPLE or REPLICATED option is used, these values will be written to the file "dpst1f.dat" instead.
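To show why the CDF and p-value are available only under the simulation method, here is a minimal Python sketch of a Monte Carlo approach. It assumes the null distribution is obtained by drawing normal samples of size n and computing g_2 for each; this page does not describe the details of Dataplot's simulation, so the quantile rule, the seed, and the helper names `simulate_null`, `critical_value` and `cdf_and_pvalue` are all assumptions. The sketch uses 20,000 replications to keep the run short, whereas the sample output on this page reports 50,000.

```python
import bisect
import random


def g2(x):
    """Kurtosis coefficient g2 as defined on this page (needs n >= 4)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / (n - 1)
    fourth = sum((v - mean) ** 4 for v in x)
    return (n * (n + 1) * fourth / ((n - 1) * (n - 2) * (n - 3) * s2 ** 2)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def simulate_null(n, n_sim=20000, seed=12345):
    """Sorted g2 values for n_sim standard normal samples of size n."""
    rng = random.Random(seed)
    return sorted(g2([rng.gauss(0.0, 1.0) for _ in range(n)])
                  for _ in range(n_sim))


def critical_value(null_stats, alpha):
    """Upper-tail critical value: the empirical (1 - alpha) quantile."""
    k = min(len(null_stats) - 1, int((1 - alpha) * len(null_stats)))
    return null_stats[k]


def cdf_and_pvalue(null_stats, stat):
    """Empirical CDF at stat and the upper-tail p-value 1 - CDF."""
    cdf = bisect.bisect_right(null_stats, stat) / len(null_stats)
    return cdf, 1.0 - cdf


null_stats = simulate_null(15)
cv95 = critical_value(null_stats, 0.05)
cdf, pval = cdf_and_pvalue(null_stats, 2.529)
print(f"cv95 = {cv95:.3f}  cdf = {cdf:.3f}  pval = {pval:.3f}")
```

Because the results depend on random draws, they will differ slightly from run to run and from Dataplot's values. For n = 15 they should land near the simulated 5% critical value of roughly 2.14 and the p-value of roughly 0.035 shown in the sample output.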
    LET A = KURTOSIS OUTLIER TEST Y
    LET A = KURTOSIS OUTLIER TEST CDF Y
    LET A = KURTOSIS OUTLIER TEST PVALUE Y
    LET A = KURTOSIS OUTLIER TEST INDEX Y

    LET ALPHA = <value>
    LET A = KURTOSIS OUTLIER TEST CRITICAL VALUE Y

The KURTOSIS OUTLIER TEST, KURTOSIS OUTLIER TEST CDF, and KURTOSIS OUTLIER TEST PVALUE commands return the value of the test statistic, the cdf of the test statistic and the p-value of the test statistic, respectively. For the KURTOSIS OUTLIER TEST CDF and KURTOSIS OUTLIER TEST PVALUE commands, the simulation method will be used. Otherwise, the method specified by the SET KURTOSIS OUTLIER TEST CRITICAL VALUES command will be used.

The KURTOSIS OUTLIER TEST INDEX command returns the row index of the most extreme value in the response variable. The most extreme value is defined as the value farthest from the mean.

The KURTOSIS OUTLIER TEST CRITICAL VALUE command returns the critical value for the specified value of ALPHA. If ALPHA is not specified, it will be set to 0.05. Note that if the ASTM method is specified for the critical values, only three values of alpha are supported (0.01, 0.05 and 0.10).

In addition to the above LET commands, built-in statistics are supported for 30+ different commands (enter HELP STATISTICS for details).
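The "value farthest from the mean" rule behind KURTOSIS OUTLIER TEST INDEX can be sketched in Python as follows. The function name `most_extreme_index` is hypothetical, and returning the first row in case of a tie is an assumption, since this page does not say how ties are resolved. The index is 1-based to match the row numbering in the sample output.

```python
def most_extreme_index(x):
    """Return the 1-based row index of the value farthest from the mean."""
    mean = sum(x) / len(x)
    # max() keeps the first maximal element, so ties go to the earliest row
    # (an assumption; Dataplot's tie rule is not documented on this page).
    position = max(range(len(x)), key=lambda i: abs(x[i] - mean))
    return position + 1


# Data from the sample program on this page (ASTM E-178 document).
y = [-1.40, -0.44, -0.30, -0.24, -0.22, -0.13, -0.05, 0.06,
     0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01]

index = most_extreme_index(y)
print(index, y[index - 1])  # prints: 1 -1.4
```

For these data the mean is 0.018, so -1.40 (deviation 1.418) is more extreme than 1.01 (deviation 0.992), in agreement with the "ID for potential outlier: 1" line of the sample output.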
Ferguson, T.S. (1961), "On the Rejection of Outliers," Fourth Berkeley Symposium on Mathematical Statistics and Probability, edited by Jerzy Neyman, University of California Press, Berkeley and Los Angeles, CA.

Ferguson, T.S. (1961), "Rules for Rejection of Outliers," Revue Inst. Int. de Stat., RINSA, Vol. 29, No. 3, pp. 29-43.
    . Step 1:   Read the data (from ASTM E-178 document)
    .
    read y
    -1.40
    -0.44
    -0.30
    -0.24
    -0.22
    -0.13
    -0.05
     0.06
     0.10
     0.18
     0.20
     0.39
     0.48
     0.63
     1.01
    end of data
    .
    . Step 2:   Compute the statistics
    .
    let stat = kurtosis outlier test y
    set kurtosis outlier test critical values astm
    let cv1 = kurtosis outlier test critical value y
    set kurtosis outlier test critical values simulation
    let cv2 = kurtosis outlier test critical value y
    .
    let pval    = kurtosis outlier test pvalue y
    let statcdf = kurtosis outlier test cdf y
    let iindx   = kurtosis outlier test index y
    .
    set write decimals 3
    print stat cv1 cv2 pval statcdf iindx
    .
    set kurtosis outlier test critical values astm
    kurtosis outlier test y
    set kurtosis outlier test critical values simulation
    kurtosis outlier test y

The following output is generated

     PARAMETERS AND CONSTANTS--

        STAT    --          2.529
        CV1     --          2.145
        CV2     --          2.150
        PVAL    --          0.037
        STATCDF --          0.967
        IINDX   --          1.000

    THE FORTRAN COMMON CHARACTER VARIABLE KURTOUTL HAS JUST BEEN SET TO ASTM

                Kurtosis Test for Outliers
                 (Assumption: Normality)

    Response Variable: Y

    H0: The most extreme point is not an outlier
    Ha: The most extreme point is an outlier
    Potential outlier value tested:              -1.400
    ID for potential outlier:                         1

    Summary Statistics:
    Number of Observations:                          15
    Sample Minimum:                              -1.400
    Sample Maximum:                               1.010
    Sample Mean:                                  0.018
    Sample SD:                                    0.551
    Sample Kurtosis:                              2.529

    Kurtosis Outlier Test Statistic Value:        2.529

    Conclusions (Upper 1-Tailed Test)
    -------------------------------------------------------------
     Alpha     CDF    Statistic    Critical Value    Conclusion
    -------------------------------------------------------------
       10%     90%        2.529             1.422     Reject H0
        5%     95%        2.529             2.145     Reject H0
        1%     99%        2.529             3.887     Accept H0

    Critical Values Based on ASTM E-178 Tables

    THE FORTRAN COMMON CHARACTER VARIABLE KURTOUTL HAS JUST BEEN SET TO SIMU

                Kurtosis Test for Outliers
                 (Assumption: Normality)

    Response Variable: Y

    H0: The most extreme point is not an outlier
    Ha: The most extreme point is an outlier
    Potential outlier value tested:              -1.400
    ID for potential outlier:                         1

    Summary Statistics:
    Number of Observations:                          15
    Sample Minimum:                              -1.400
    Sample Maximum:                               1.010
    Sample Mean:                                  0.018
    Sample SD:                                    0.551
    Sample Kurtosis:                              2.529

    Kurtosis Outlier Test Statistic Value:        2.529
    CDF Value:                                    0.965
    P-Value:                                      0.035

    Conclusions (Upper 1-Tailed Test)
    -------------------------------------------------------------
     Alpha     CDF    Statistic    Critical Value    Conclusion
    -------------------------------------------------------------
       20%     80%        2.529             0.709     Reject H0
       10%     90%        2.529             1.414     Reject H0
        5%     95%        2.529             2.138     Reject H0
      2.5%   97.5%        2.529             2.886     Accept H0
        1%     99%        2.529             3.969     Accept H0
      0.5%   99.5%        2.529             4.683     Accept H0

    Critical Values Based on 50,000 Simulations
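The "Conclusions" tables in the output follow an upper one-tailed decision rule: H0 is rejected at a given alpha when the statistic exceeds the corresponding critical value. A minimal Python sketch of that rule, using the ASTM critical values printed in the output for n = 15, might look like this. The function name `conclusions` is hypothetical, and treating a statistic exactly equal to the critical value as "Accept H0" is an assumption.

```python
def conclusions(statistic, critical_values):
    """Map each alpha to 'Reject H0' or 'Accept H0' for an upper-tail test."""
    return {
        alpha: "Reject H0" if statistic > cv else "Accept H0"
        for alpha, cv in critical_values.items()
    }


# ASTM E-178 critical values for n = 15, copied from the output on this page.
astm_critical_values = {0.10: 1.422, 0.05: 2.145, 0.01: 3.887}

table = conclusions(2.529, astm_critical_values)
for alpha, decision in table.items():
    print(f"{alpha:>5.0%}  {decision}")
```

For the sample data, -1.40 is therefore flagged as an outlier at the 10% and 5% levels but not at the 1% level.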
Date created: 01/22/2020
Last updated: 12/11/2023

Please email comments on this WWW page to alan.heckert@nist.gov.