TWO SAMPLE LINEAR RANK SUM TEST

Name:

TWO SAMPLE LINEAR RANK SUM TEST

Type:

Analysis Command

Purpose:

Perform a two sample linear rank sum test for various scores.

Description:

Two sample linear rank sum tests are based on the statistic

\( S = \sum_{i=1}^{n}{\mbox{tag}_{i} \, a(R_i)} \)

with \( n \) denoting the combined sample size and \( R_i \) denoting the rank of the i-th observation. The variable tag is an indicator variable that has the value 1 for the observations from the smaller sample and the value 0 for the observations from the larger sample (if n1 = n2, tag is set to 1 for the sample that the first observation comes from). The function \( a(R_i) \) is a score function based on the ranks. The supported score functions are described in a Note section below.

The following test statistic is based on asymptotic normality

\( z = \frac{S - E_{0}(S)} {SD_{0}(S)} \)

where

\( \begin{array}{lcl} E_{0}(S) & = & \mbox{the expected value of } S \mbox{ under the null hypothesis} \\ & = & \frac{n1}{n} \sum_{i=1}^{n}{a(R_i)} \end{array} \)

\( \begin{array}{lcl} SD_{0}(S) & = & \mbox{the standard deviation of } S \mbox{ under the null hypothesis} \\ & = & \sqrt{\frac{n1 n2}{n(n-1)} \sum_{i=1}^{n} {(a(R_{i}) - \bar{a})^{2}}} \end{array} \)

\( \begin{array}{lcl} \bar{a} & = & \mbox{the average score} \\ & = & \frac{\sum_{i=1}^{n}{a(R_i)}} {n} \end{array} \)

Note that n1 denotes the sample size for the smaller sample, not necessarily the sample size of Y1.

Tied ranks use the average rank of the tied values.
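The computation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Dataplot's implementation: the helper names `average_ranks` and `linear_rank_sum` are hypothetical, the default score is the Wilcoxon score (the rank itself), and the standard deviation is taken as the square root of the permutation variance of S.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def linear_rank_sum(y1, y2, score=lambda r, n: r):
    """Return (S, E0, SD0, z) for the two sample linear rank statistic."""
    n = len(y1) + len(y2)
    ranks = average_ranks(list(y1) + list(y2))
    a = [score(r, n) for r in ranks]
    # tag = 1 for the smaller sample; with equal sizes, for the first sample
    tag_first = len(y1) <= len(y2)
    n1 = len(y1) if tag_first else len(y2)
    n2 = n - n1
    tagged = a[:len(y1)] if tag_first else a[len(y1):]
    s = sum(tagged)
    a_bar = sum(a) / n
    e0 = n1 * a_bar
    sd0 = math.sqrt(n1 * n2 / (n * (n - 1)) * sum((x - a_bar) ** 2 for x in a))
    return s, e0, sd0, (s - e0) / sd0

# Small made-up data set with one tie (2.2 appears in both samples)
s, e0, sd0, z = linear_rank_sum([1.1, 2.2, 3.3], [0.5, 2.2, 4.0, 5.0])
print(s, e0)  # 10.5 12.0
```

The data in the usage lines are invented for illustration only. Passing a different `score` callable (taking the rank and the combined sample size) gives the other linear rank statistics.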

Syntax 1:

If LOWER TAILED is specified, a lower tailed test is performed. If UPPER TAILED is specified, an upper tailed test is performed. If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.
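As a rough illustration of how the three kinds of p-value relate to the normal approximation, the following Python sketch derives them from a z statistic. The function names are hypothetical, and the convention used (lower-tailed p-value equal to the CDF, upper-tailed equal to one minus the CDF, two-tailed equal to twice the smaller tail) is inferred from the sample output in the Program section rather than from Dataplot's source.

```python
import math

def normal_cdf(z):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tail_p_values(z):
    """Lower, upper and two-tailed p-values for a standard normal z statistic."""
    cdf = normal_cdf(z)
    lower = cdf                      # matches "P-Value (lower-tailed test)"
    upper = 1.0 - cdf                # matches "P-Value (upper-tailed test)"
    two_tailed = 2.0 * min(lower, upper)
    return {"lower": lower, "upper": upper, "two_tailed": two_tailed}

# z value taken from the Van Der Waerden sample output in the Program section
p = tail_p_values(1.56365)
print({k: round(v, 5) for k, v in p.items()})
```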

Syntax 2:

This syntax performs all the two-way two sample linear rank sum tests for the listed variables. This syntax supports the TO syntax.

Examples:

Note:

The score function is specified with the command

SET LINEAR RANK SUM TEST SCORE <case>

where <case> is one of the following

WILCOX
This option uses Wilcoxon scores
\( a(R_i) = R_i \)
That is, the Wilcoxon scores are simply the ranks. Using this score is essentially a rank sum test (also known as the Mann-Whitney test).
This score is primarily used to test for equal locations.
MEDIAN
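This option uses median scores
\( a(R_i) = \left\{ \begin{array}{ll} 1 & \mbox{if } R_i > (n+1)/2 \\ 0 & \mbox{if } R_i \le (n+1)/2 \end{array} \right. \)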
That is, ranks greater than the median rank are scored as a 1 and ranks less than or equal to the median rank are scored as 0. Using this score is essentially a 2-sample median test. Median scores work best for distributions that are symmetric and heavy-tailed.
This score is primarily used to test for equal locations.
VAN DER WAERDEN
This option uses the Van Der Waerden scores
\( a(R_i) = \Phi^{-1}\left(\frac{R_i}{n+1}\right) \)
with \( \Phi^{-1} \) denoting the percent point function of the standard normal distribution. Van Der Waerden scores are the percentiles of a standard normal distribution. Using this score is essentially a 2-sample Van Der Waerden test.
This score is primarily used to test for equal locations.
SAVAGE
This option uses the Savage scores
\( a(R_i) = \sum_{j=1}^{R_i}{\frac{1}{n-j+1}} - 1 \)
Savage scores are the expected values of exponential order statistics minus 1 (to center the scores around 0). Savage scores are typically used to test location differences in extreme value distributions and to test scale differences in exponential distributions.
MOOD
This option uses the Mood scores
\( a(R_i) = \left(R_i - \frac{n+1}{2}\right)^{2} \)
Mood scores are the square of the difference between the observation rank and the average rank.
This score is primarily used to test for equal scales.
ANSARI BRADLEY
This option uses the Ansari-Bradley scores
\( a(R_i) = \frac{n+1}{2} - \left|R_i - \frac{n+1}{2}\right| \)
This score is often given in a different form, but the form given here is useful for computational purposes.
This score is primarily used to test for equal scales.
KLOTZ
This option uses the Klotz scores
\( a(R_i) = \left(\Phi^{-1}\left(\frac{R_i}{n+1}\right)\right)^{2} \)
This score is the square of the Van Der Waerden score. Using this score is essentially a 2-sample Klotz test.
This score is primarily used to test for equal scales.
CONOVER
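This option uses the Conover scores
\( a(R_i) = (\mbox{rank}(U_i))^{2} \)
where \( U_i = |X_i - \bar{X}| \), with \( \bar{X} \) denoting the mean of the sample that the i-th observation belongs to.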
That is, the Conover scores are the squared ranks of the absolute deviations from the group mean. Using this score is essentially a 2-sample squared ranks test.
This score is primarily used to test for equal scales.
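The rank-based score functions described above can be sketched in Python as follows. The function names are hypothetical, and the formulas are the common textbook forms; in particular, evaluating the normal percent point function at R/(n+1) and the particular Ansari-Bradley form are assumptions about what Dataplot computes. The Savage sketch truncates non-integer (tied) ranks, which is a simplification. Conover scores are omitted because they depend on the raw data rather than on the ranks alone.

```python
from statistics import NormalDist

# Percent point function (inverse CDF) of the standard normal distribution
_PPF = NormalDist().inv_cdf

def wilcoxon(r, n):
    """Wilcoxon score: the rank itself."""
    return float(r)

def median(r, n):
    """Median score: 1 above the median rank (n+1)/2, otherwise 0."""
    return 1.0 if r > (n + 1) / 2 else 0.0

def van_der_waerden(r, n):
    """Van Der Waerden score: normal quantile at r/(n+1) (assumed argument)."""
    return _PPF(r / (n + 1))

def savage(r, n):
    """Savage score: expected exponential order statistic minus 1.
    Non-integer (tied) ranks are truncated here as a simplification."""
    return sum(1.0 / (n - j + 1) for j in range(1, int(r) + 1)) - 1.0

def mood(r, n):
    """Mood score: squared distance of the rank from the average rank."""
    return (r - (n + 1) / 2) ** 2

def ansari_bradley(r, n):
    """Ansari-Bradley score in a computational form (assumed form)."""
    return (n + 1) / 2 - abs(r - (n + 1) / 2)

def klotz(r, n):
    """Klotz score: square of the Van Der Waerden score."""
    return van_der_waerden(r, n) ** 2

# Ansari-Bradley scores for n = 6 rise to the middle and fall again
print([ansari_bradley(r, 6) for r in range(1, 7)])  # [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]
```

Each function takes the rank and the combined sample size, so any of them can be applied to every rank of the combined sample before summing over the tagged observations.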

Note:

The following parameters are saved after the test is performed:

STATVAL	-	value of the test statistic
STATCDF	-	CDF of the test statistic
PVALUE	-	p-value of the two tailed test statistic
PVALUELT	-	p-value of the lower tailed test statistic
PVALUEUT	-	p-value of the upper tailed test statistic

CUTUPP90	-	90% upper critical value
CUTUPP95	-	95% upper critical value
CUTUP975	-	97.5% upper critical value
CUTUPP99	-	99% upper critical value
CUTUP995	-	99.5% upper critical value
CUTUP999	-	99.9% upper critical value

CUTLOW10	-	10% lower critical value
CUTLOW05	-	5% lower critical value
CUTLO025	-	2.5% lower critical value
CUTLOW01	-	1% lower critical value
CUTLO005	-	0.5% lower critical value
CUTLO001	-	0.1% lower critical value
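Because the test uses a normal approximation, the critical values listed above can be reproduced from the standard normal percent point function. The Python sketch below is an illustration under that assumption; the dictionary names are hypothetical, and only the parameter names come from the list above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Upper critical values: quantiles at the stated cumulative probabilities
upper = {
    name: std_normal.inv_cdf(p)
    for name, p in [
        ("CUTUPP90", 0.90), ("CUTUPP95", 0.95), ("CUTUP975", 0.975),
        ("CUTUPP99", 0.99), ("CUTUP995", 0.995), ("CUTUP999", 0.999),
    ]
}

# Lower critical values: quantiles at the stated lower-tail probabilities
lower = {
    name: std_normal.inv_cdf(p)
    for name, p in [
        ("CUTLOW10", 0.10), ("CUTLOW05", 0.05), ("CUTLO025", 0.025),
        ("CUTLOW01", 0.01), ("CUTLO005", 0.005), ("CUTLO001", 0.001),
    ]
}

for name, value in {**upper, **lower}.items():
    print(f"{name:9s} {value:9.5f}")
```

The values for CUTUPP90, CUTUPP95, CUTUP975 and CUTUP995 agree with the upper critical values 1.28155, 1.64485, 1.95996 and 2.57583 shown in the two-tailed tables of the sample output.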

Note:

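The following statistics can be computed with the LET command (the forms shown are those used in the Program section):

LET A = LINEAR RANK SUM TEST                    Y1 Y2
LET A = LINEAR RANK SUM TEST CDF                Y1 Y2
LET A = LINEAR RANK SUM TEST PVALUE             Y1 Y2
LET A = LINEAR RANK SUM TEST LOWER TAIL PVALUE  Y1 Y2
LET A = LINEAR RANK SUM TEST UPPER TAIL PVALUE  Y1 Y2

In addition to the above LET commands, built-in statistics are supported for 30+ different commands (enter HELP STATISTICS for details).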

Default:

The default score function is WILCOX.

Synonyms:

2 SAMPLE is a synonym for TWO SAMPLE.

Related Commands:

RANK SUM TEST	=	Perform a 2-sample rank sum test for location
MEDIAN TEST	=	Perform a k-sample medians test
VAN DER WAERDEN TEST	=	Perform a k-sample Van Der Waerden test
SIEGEL TUKEY TEST	=	Perform a 2-sample Siegel Tukey test
SQUARED RANKS TEST	=	Perform a k-sample squared ranks test for homogeneous variances
KLOTZ TEST	=	Perform a k-sample Klotz test for homogeneous variances

Applications:

Two Sample Analysis

Implementation Date:

2023/07

Program:

 
. Step 1:   Read the data
.
skip 25
read shoemake.dat y1 y2
skip 0
let y x = stack y1 y2
.
. Step 2:   Generate the statistics
.
set linear rank sum test score van der waerden
let statval = linear rank sum test                        y1 y2
let statcdf = linear rank sum test cdf                    y1 y2
let pvalue  = linear rank sum test pvalue                 y1 y2
let pvallt  = linear rank sum test lower tail pvalue      y1 y2
let pvalut  = linear rank sum test upper tail pvalue      y1 y2
let statval = round(statval,2)
let statcdf = round(statcdf,2)
let pvalue  = round(pvalue,2)
let pvallt  = round(pvallt,2)
let pvalut  = round(pvalut,2)
.
print "Van Der Waerden Scores:"
print "Test Statistic:                        ^statval"
print "Test Statistic CDF:                    ^statcdf"
print "Test Statistic P-Value:                ^pvalue"
print "Test Statistic Lower Tailed P-Value:   ^pvallt"
print "Test Statistic Upper Tailed P-Value:   ^pvalut"
.
two sample linear rank sum test                y1 y2
van der waerden test                           y  x
.
set linear rank sum test score wilcox
two sample linear rank sum test                y1 y2
t test                                         y1 y2
.
set linear rank sum test score klotz
two sample linear rank sum test                y1 y2
klotz test                                     y  x

Van Der Waerden Scores:
Test Statistic:                        1.56
Test Statistic CDF:                    0.94
Test Statistic P-Value:                0.12
Test Statistic Lower Tailed P-Value:   0.94
Test Statistic Upper Tailed P-Value:   0.06
  
             Two Sample Two-Sided Linear Rank Sum Test
                     (Van Der Waerden Scores)
  
 First Response Variable: Y1
 Second Response Variable: Y2
  
 H0: Location1 = Location2
 Ha: Location1 not equal Location2
  
 Summary Statistics:
 Number of Observations for Sample 1:                 10
 Mean for Sample 1:                              6.02100
 Median for Sample 1:                            5.53000
 Standard Deviation for Sample 1:                1.58184
 Number of Observations for Sample 2:                 10
 Mean for Sample 2:                              5.01900
 Median for Sample 2:                            5.03500
 Standard Deviation for Sample 2:                1.10440
  
 Test (Normal Approximation):
 Test Statistic Value:                           1.56365
 Score Value:                                    3.11351
 Expected Value of Test Statistic:               0.00786
 Standard Deviation of Test Statistic:           1.98615
 CDF Value:                                      0.94105
 P-Value (2-tailed test):                        0.11790
 P-Value (lower-tailed test):                    0.94105
 P-Value (upper-tailed test):                    0.05895
  
  
             Two-Tailed Test: Normal Approximation
  
 ---------------------------------------------------------------------------
                                         Lower          Upper           Null
    Significance           Test       Critical       Critical     Hypothesis
           Level      Statistic      Value (<)      Value (>)     Conclusion
 ---------------------------------------------------------------------------
           80.0%        1.56365       -1.28155        1.28155         REJECT
           90.0%        1.56365       -1.64485        1.64485         ACCEPT
           95.0%        1.56365       -1.95996        1.95996         ACCEPT
           99.0%        1.56365       -2.57583        2.57583         ACCEPT
  
  
 THE FORTRAN COMMON CHARACTER VARIABLE LINERANK HAS JUST BEEN SET TO WILC
  
             Two Sample Two-Sided Linear Rank Sum Test
                         (Wilcoxon Scores)
  
 First Response Variable: Y1
 Second Response Variable: Y2
  
 H0: Location1 = Location2
 Ha: Location1 not equal Location2
  
 Summary Statistics:
 Number of Observations for Sample 1:                 10
 Mean for Sample 1:                              6.02100
 Median for Sample 1:                            5.53000
 Standard Deviation for Sample 1:                1.58184
 Number of Observations for Sample 2:                 10
 Mean for Sample 2:                              5.01900
 Median for Sample 2:                            5.03500
 Standard Deviation for Sample 2:                1.10440
  
 Test (Normal Approximation):
 Test Statistic Value:                           1.47628
 Score Value:                                  124.50000
 Expected Value of Test Statistic:             105.00000
 Standard Deviation of Test Statistic:          13.20885
 CDF Value:                                      0.93007
 P-Value (2-tailed test):                        0.13987
 P-Value (lower-tailed test):                    0.93007
 P-Value (upper-tailed test):                    0.06993
  
  
             Two-Tailed Test: Normal Approximation
  
 ---------------------------------------------------------------------------
                                         Lower          Upper           Null
    Significance           Test       Critical       Critical     Hypothesis
           Level      Statistic      Value (<)      Value (>)     Conclusion
 ---------------------------------------------------------------------------
           80.0%        1.47628       -1.28155        1.28155         REJECT
           90.0%        1.47628       -1.64485        1.64485         ACCEPT
           95.0%        1.47628       -1.95996        1.95996         ACCEPT
           99.0%        1.47628       -2.57583        2.57583         ACCEPT
  
  
 THE FORTRAN COMMON CHARACTER VARIABLE LINERANK HAS JUST BEEN SET TO KLOT
  
             Two Sample Two-Sided Linear Rank Sum Test
                          (Klotz Scores)
  
 First Response Variable: Y1
 Second Response Variable: Y2
  
 H0: Scale1 = Scale2
 Ha: Scale1 not equal Scale2
  
 Summary Statistics:
 Number of Observations for Sample 1:                 10
 Mean for Sample 1:                              6.02100
 Median for Sample 1:                            5.53000
 Standard Deviation for Sample 1:                1.58184
 Number of Observations for Sample 2:                 10
 Mean for Sample 2:                              5.01900
 Median for Sample 2:                            5.03500
 Standard Deviation for Sample 2:                1.10440
  
 Test (Normal Approximation):
 Test Statistic Value:                           0.26908
 Score Value:                                    8.01749
 Expected Value of Test Statistic:               7.49513
 Standard Deviation of Test Statistic:           1.94130
 CDF Value:                                      0.60606
 P-Value (2-tailed test):                        0.78787
 P-Value (lower-tailed test):                    0.60606
 P-Value (upper-tailed test):                    0.39394
  
  
             Two-Tailed Test: Normal Approximation
  
 ---------------------------------------------------------------------------
                                         Lower          Upper           Null
    Significance           Test       Critical       Critical     Hypothesis
           Level      Statistic      Value (<)      Value (>)     Conclusion
 ---------------------------------------------------------------------------
           80.0%        0.26908       -1.28155        1.28155         ACCEPT
           90.0%        0.26908       -1.64485        1.64485         ACCEPT
           95.0%        0.26908       -1.95996        1.95996         ACCEPT
           99.0%        0.26908       -2.57583        2.57583         ACCEPT
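As a consistency check on the output above, the reported test statistics can be recomputed from the reported score value, expected value and standard deviation using z = (S - E0(S)) / SD0(S), and the ACCEPT/REJECT conclusions can be recovered from the two-sided normal critical values. The Python sketch below uses only numbers copied from the output; the function names are hypothetical, and the rule "reject when |z| exceeds the upper critical value" is inferred from the tables.

```python
from statistics import NormalDist

# (score value S, expected value E0, standard deviation SD0, reported z)
reported = {
    "van der waerden": (3.11351, 0.00786, 1.98615, 1.56365),
    "wilcoxon": (124.50000, 105.00000, 13.20885, 1.47628),
    "klotz": (8.01749, 7.49513, 1.94130, 0.26908),
}

def z_score(s, e0, sd0):
    """Normal approximation statistic z = (S - E0(S)) / SD0(S)."""
    return (s - e0) / sd0

def conclusion(z, level):
    """Two-tailed decision at the given level (e.g. 0.80 for the 80.0% row)."""
    critical = NormalDist().inv_cdf((1.0 + level) / 2.0)
    return "REJECT" if abs(z) > critical else "ACCEPT"

for name, (s, e0, sd0, z_reported) in reported.items():
    z = z_score(s, e0, sd0)
    decisions = [conclusion(z, level) for level in (0.80, 0.90, 0.95, 0.99)]
    print(f"{name}: z = {z:.5f} (reported {z_reported}); {decisions}")
```

Small differences in the last digit can arise because the inputs are themselves rounded to five decimal places in the output.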