An R Introduction to Statistics

Wilcoxon Signed-Rank Test

Two data samples are matched if they come from repeated observations of the same subject. Using the Wilcoxon signed-rank test, we can decide whether the corresponding population distributions are identical without assuming that they follow the normal distribution.

Example

The built-in data set named immer records the barley yield of the same field in the years 1931 and 1932. The yield data are presented in the data frame columns Y1 and Y2.

> library(MASS)         # load the MASS package 
> head(immer) 
  Loc Var    Y1    Y2 
1  UF   M  81.0  80.7 
2  UF   S 105.4  82.3 
    .....

Problem

Without assuming that the data follow a normal distribution, test at the .05 significance level whether the barley yields of 1931 and 1932 in the data set immer have identical distributions.

Solution

The null hypothesis is that the barley yields of the two sample years come from identical populations. To test the hypothesis, we apply the wilcox.test function to compare the matched samples. For the paired test, we set the paired argument to TRUE. As the p-value turns out to be 0.005318, which is less than the .05 significance level, we reject the null hypothesis. The warning at the end of the output indicates that the data contain ties, so R reports a p-value based on a normal approximation instead of an exact one.

> wilcox.test(immer$Y1, immer$Y2, paired=TRUE) 
 
        Wilcoxon signed rank test with continuity correction 
 
data:  immer$Y1 and immer$Y2 
V = 368.5, p-value = 0.005318 
alternative hypothesis: true location shift is not equal to 0 
 
Warning message: 
In wilcox.test.default(immer$Y1, immer$Y2, paired = TRUE) : 
  cannot compute exact p-value with ties
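
To see where the statistic V comes from, the computation can be sketched by hand. The sketch assumes the usual signed-rank definition: drop zero differences, rank the absolute differences (tied values share the average rank), and sum the ranks of the positive differences. The names d, r, V and res are illustrative and not part of the original example.

```r
library(MASS)                      # provides the immer data set

d <- immer$Y1 - immer$Y2           # paired differences, 1931 minus 1932
d <- d[d != 0]                     # zero differences carry no sign and are dropped
r <- rank(abs(d))                  # rank absolute differences; ties get the average rank
V <- sum(r[d > 0])                 # sum of the ranks belonging to positive differences

# Compare with the statistic reported by wilcox.test (ties warning suppressed)
res <- suppressWarnings(wilcox.test(immer$Y1, immer$Y2, paired = TRUE))
print(V)
print(unname(res$statistic))
```

If the sketch is right, both printed values should agree with the V shown in the output above.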

Answer

At the .05 significance level, we conclude that the barley yields of 1931 and 1932 in the data set immer come from nonidentical populations.
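
The same decision can be reached programmatically. The object returned by wilcox.test has a p.value component, which can be compared with the significance level directly. In this sketch, alpha, res and reject are illustrative names. Setting exact = FALSE is an assumption added here: it requests the normal approximation explicitly, which should avoid the warning about ties.

```r
library(MASS)                      # provides the immer data set

alpha <- 0.05                      # significance level stated in the problem

# Paired signed-rank test, asking for the normal approximation explicitly
res <- wilcox.test(immer$Y1, immer$Y2, paired = TRUE, exact = FALSE)

reject <- res$p.value < alpha      # TRUE means the null hypothesis is rejected
print(res$p.value)
print(reject)
```

Storing the result in a variable also gives access to the other components, such as res$statistic for the value of V.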