VAN DER WAERDEN

Name:
    VAN DER WAERDEN
Type:
    Analysis Command
Purpose:
    Perform a Van Der Waerden (normal scores) test that k population distribution functions are equal.
Description:
    Analysis of Variance (ANOVA) is a data analysis technique for examining the significance of the factors (= independent variables) in a multi-factor model. The one factor model can be thought of as a generalization of the two sample t-test. That is, the two sample t-test is a test of the hypothesis that two population means are equal. The one factor ANOVA tests the hypothesis that k population means are equal.

    The standard ANOVA assumes that the errors (i.e., residuals) are normally distributed. If this normality assumption is not valid, an alternative is to use a non-parametric test.

    The most common non-parametric test for the one-factor model is the Kruskal-Wallis test. The Kruskal-Wallis test is based on the ranks of the data. The Van Der Waerden test converts the ranks to quantiles of the standard normal distribution (details given below). These are called normal scores and the test is computed from these normal scores.

    The advantage of the Van Der Waerden test is that it provides the high efficiency of the standard ANOVA analysis when the normality assumptions are in fact satisfied, but it also provides the robustness of the Kruskal-Wallis test when the normality assumptions are not satisfied.

    Let ni (i = 1, 2, ..., k) represent the sample sizes for each of the k groups (i.e., samples) in the data. Let N denote the total sample size over all groups. Let Xij represent the jth value in the ith group. Then compute the normal scores as follows:

      \( A_{ij} = \Phi^{-1}\left(\frac{R(X_{ij})}{N+1}\right) \)

    with R(Xij) denoting the rank of observation Xij in the pooled sample of all N observations and \( \Phi^{-1} \) denoting the normal percent point function (the inverse of the standard normal cumulative distribution function).
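    The conversion from ranks to normal scores can be sketched in Python as follows. This is an illustrative sketch, not Dataplot's implementation: the function names are hypothetical, and assigning tied values the average of their ranks is an assumption, since the text above does not say how ties are handled.

```python
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1..N; tied values share the average of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def normal_scores(values):
    """Return Phi^{-1}(R / (N + 1)) for each observation in the pooled sample."""
    n_total = len(values)
    inv_cdf = NormalDist().inv_cdf  # standard normal percent point function
    return [inv_cdf(r / (n_total + 1)) for r in average_ranks(values)]

# Hypothetical pooled data (all groups combined), with one tie.
pooled = [4.2, 1.5, 3.3, 1.5, 5.0]
scores = normal_scores(pooled)
```

    Dividing by N + 1 rather than N keeps every argument strictly between 0 and 1, so the largest observation still maps to a finite normal score.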

    The average of the normal scores for each sample can then be computed as

      \( \bar{A}_{i} = \frac{1}{n_i} \sum_{j=1}^{n_i}{A_{ij}} \hspace{0.5in} i = 1, 2, ... , k \)

    The variance of the normal scores can be computed as

      \( s^2 = \frac{1}{N-1} \sum_{i=1}^{k}{\sum_{j=1}^{n_i}{A_{ij}^{2}}} \)

    The Van Der Waerden test can then be defined as follows.

      H0: All of the k population distribution functions are identical
      HA: At least one of the populations tends to yield larger observations than at least one of the other populations
      Test Statistic: \( T_1 = \frac{1}{s^2} \sum_{i=1}^{k}{n_{i} (\bar{A}_{i})^{2}} \)
      Significance Level: \( \alpha \)
      Critical Region: T1 > CHIPPF(\( 1 - \alpha \),k-1) where CHIPPF is the chi-square percent point function with k-1 degrees of freedom.
      Conclusion: Reject the null hypothesis if the test statistic is in the critical region.
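    The computation of the test statistic can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than Dataplot's code: the function name and the data are hypothetical, and ties are given average ranks. To stay within the standard library, the example uses k = 3 groups, for which the chi-square distribution with 2 degrees of freedom has the closed-form upper-tail critical value -2 ln(alpha).

```python
import math
from statistics import NormalDist

def van_der_waerden_statistic(groups):
    """groups: one list of observations per group. Returns (T1, s2)."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    sorted_vals = sorted(pooled)

    def rank(v):
        # Average of the 1-based positions that v occupies in the sorted pooled sample.
        first = sorted_vals.index(v) + 1
        count = sorted_vals.count(v)
        return first + (count - 1) / 2

    inv_cdf = NormalDist().inv_cdf
    scores = [[inv_cdf(rank(x) / (n_total + 1)) for x in g] for g in groups]
    # Variance of the normal scores: sum of squared scores over N - 1.
    s2 = sum(a * a for g in scores for a in g) / (n_total - 1)
    # T1 = (1 / s2) * sum over groups of n_i * (mean score of group i)^2.
    t1 = sum(len(g) * (sum(g) / len(g)) ** 2 for g in scores) / s2
    return t1, s2

# Hypothetical data: three clearly separated groups of three observations.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
t1, s2 = van_der_waerden_statistic(groups)

alpha = 0.05
# For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
# so the upper-tail critical value at level alpha is -2 * ln(alpha).
critical = -2.0 * math.log(alpha)
reject_h0 = t1 > critical
```

    For other values of k the critical value would come from a chi-square table or a statistics library instead of the closed form used here.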
Syntax:
    VAN DER WAERDEN <y> <x>             <SUBSET/EXCEPT/FOR qualification>
    where <y> is the response (= dependent) variable;
                <x> is the factor (= independent) variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    VAN DER WAERDEN Y X
    VAN DER WAERDEN Y X SUBSET X = 1 TO 4
Note:
    If the hypothesis of identical distributions is rejected, you can perform a multiple comparisons procedure to determine which pairs of populations tend to differ.

    The populations i and j seem to be different if the following inequality is satisfied:

      \( |\bar{A}_{i} - \bar{A}_{j}| > t_{1-\alpha/2} \sqrt{s^2 \frac{N-1-T_1}{N-k}} \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \)

    with t denoting the percent point function of the t distribution with N - k degrees of freedom and T1 denoting the Van Der Waerden test statistic.

    Dataplot writes all the pairwise multiple comparisons to the file "dpst1f.dat" in the current directory.
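    The right-hand side of the multiple comparisons inequality can be sketched in Python as follows. The function name is hypothetical, and the t percent point is passed in by the caller (for example from a t table) rather than computed. In the usage lines, the variance, test statistic, N, and k are taken from the sample output on this page, while the group sizes of 25 and the t value of 1.985 are assumptions made for illustration only.

```python
import math

def pairwise_critical_value(t_quantile, s2, t1, n_total, k, n_i, n_j):
    """Critical value for |Abar(i) - Abar(j)| in the multiple comparisons procedure.

    t_quantile is the 1 - alpha/2 percent point of the t distribution,
    supplied by the caller; it is not computed here.
    """
    scale = s2 * (n_total - 1 - t1) / (n_total - k)
    return t_quantile * math.sqrt(scale) * math.sqrt(1.0 / n_i + 1.0 / n_j)

# s2, T1, N and k from the sample output; group sizes and t value are assumed.
cv = pairwise_critical_value(
    t_quantile=1.985, s2=0.92890, t1=39.78569, n_total=99, k=4, n_i=25, n_j=25
)

# A pair is flagged as different when its absolute difference exceeds cv.
flagged = 0.91644 > cv
```

    Because the critical value depends on ni and nj, pairs with different group sizes get slightly different critical values even at the same significance level.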

Note:
    Dataplot writes the following information to the file "dpst2f.dat" in the current directory:

      Column 1 = Index
      Column 2 = Raw Data Value
      Column 3 = Rank
      Column 4 = Normal Score
      Column 5 = Group Average of the Normal Scores (Abar(i))
      Column 6 = Group Sample Sizes (ni)
Default:
    None
Synonyms:
    NORMAL SCORES TEST is a synonym for VAN DER WAERDEN TEST.
    The word TEST is optional.
Related Commands:
Reference:
    W. J. Conover (1999). "Practical Nonparametric Statistics", Third Edition, Wiley, pp. 396-406.
Applications:
    Analysis of Variance
Implementation Date:
    2004/10
Program:
     
    SKIP 25
    READ SPLETT2.DAT Y MACHINE
    SET WRITE DECIMALS 5
    VAN DER WAERDEN Y MACHINE
        
    The following output is generated.
     
                Van Der Waerden (Normal Scores) One Factor Test
     
    Response Variable: Y
    Group-ID Variable: MACHINE
     
    H0: Samples Come From Identical Populations
    Ha: Samples Do Not Come From Identical Populations
     
    Summary Statistics:
    Total Number of Observations:                      99
    Number of Groups:                                  4
     
    Variance of Normal Scores of Ranks                 0.92890
    Van Der Waerden Test Statistic Value:              39.78569
    CDF of Test Statistic:                             1.00000
    P-Value:                                           0.00000
     
     
    Percent Points of the Chi-Square Reference Distribution
    -----------------------------------
      Percent Point               Value
    -----------------------------------
                0.0    =          0.000
               50.0    =          2.366
               75.0    =          4.108
               90.0    =          6.251
               95.0    =          7.815
               97.5    =          9.348
               99.0    =         11.345
               99.9    =         16.266
     
    Conclusions (Upper 1-Tailed Test)
    ----------------------------------------------
      Alpha    CDF   Critical Value     Conclusion
    ----------------------------------------------
        10%    90%            6.251      Reject H0
         5%    95%            7.815      Reject H0
       2.5%  97.5%            9.348      Reject H0
         1%    99%           11.345      Reject H0
     
     
                Multiple Comparisons Table
     
    ---------------------------------------------------------------------------
        I    J     |Abar(i)-Abar(j         90% CV         95% CV         99% CV
    ---------------------------------------------------------------------------
        1    2             0.65906        0.35813        0.42803        0.56674
        1    3             1.57550        0.35813        0.42803        0.56674
        1    4             0.17816        0.35813        0.42803        0.56674
        2    3             0.91644        0.35446        0.42364        0.56092
        2    4             0.48089        0.35446        0.42364        0.56092
        3    4             1.39733        0.35446        0.42364        0.56092
     
        
Date created: 01/05/2006
Last updated: 12/11/2023
