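Python's scipy.stats.ranksums and R's wilcox.test should both compute a two-sided p-value for the Wilcoxon rank-sum test. But when I run both functions on the same data, I get p-values that differ by orders of magnitude: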
R:
> x=c(57.07168,46.95301,31.86423,38.27486,77.89309,76.78879,33.29809,58.61569,18.26473,62.92256,50.46951,19.14473,22.58552,24.14309)
> y=c(8.319966,2.569211,1.306941,8.450002,1.624244,1.887139,1.376355,2.521150,5.940253,1.458392,3.257468,1.574528,2.338976)
> print(wilcox.test(x, y))
Wilcoxon rank sum test
data: x and y
W = 182, p-value = 9.971e-08
alternative hypothesis: true location shift is not equal to 0
Python:
>>> x=[57.07168,46.95301,31.86423,38.27486,77.89309,76.78879,33.29809,58.61569,18.26473,62.92256,50.46951,19.14473,22.58552,24.14309]
>>> y=[8.319966,2.569211,1.306941,8.450002,1.624244,1.887139,1.376355,2.521150,5.940253,1.458392,3.257468,1.574528,2.338976]
>>> scipy.stats.ranksums(x, y)
(4.415880433163923, 1.0059968254463979e-05)
So R gives me about 1e-7, while Python gives me about 1e-5.
Where does this difference come from, and which one is the "correct" p-value?
Answer 0 (score: 20)
It depends on the choice of options (exact vs. normal approximation, with or without continuity correction):
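R's default, from the documentation:
By default (if "exact" is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used.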
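The quoted rule can be sketched in Python. This is only my reading of the documentation text, not R's source: the function name `r_default_is_exact` is hypothetical, and I am assuming that "fewer than 50 finite values" applies to each sample separately and that ties are checked across the pooled data.

```python
from math import isfinite


def r_default_is_exact(x, y):
    """Guess whether wilcox.test(x, y) would use the exact test by default.

    Assumption: each sample must have fewer than 50 finite values, and
    the pooled finite values must contain no ties.
    """
    finite_x = [v for v in x if isfinite(v)]
    finite_y = [v for v in y if isfinite(v)]
    pooled = finite_x + finite_y
    no_ties = len(set(pooled)) == len(pooled)
    return len(finite_x) < 50 and len(finite_y) < 50 and no_ties
```

For the data in the question (14 and 13 distinct values) this returns True, which is consistent with R reporting the exact p-value by default.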
The default (as shown above):
wilcox.test(x, y)
Wilcoxon rank sum test
data: x and y
W = 182, p-value = 9.971e-08
alternative hypothesis: true location shift is not equal to 0
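To see where the exact value comes from, here is a minimal pure-Python sketch that counts rank arrangements directly. The helper names (`u_statistic`, `count_arrangements`, `exact_two_sided_p`) are my own, it assumes there are no ties, and doubling the smaller tail is my understanding of how the two-sided exact p-value is formed.

```python
from functools import lru_cache
from math import comb

x = [57.07168, 46.95301, 31.86423, 38.27486, 77.89309, 76.78879, 33.29809,
     58.61569, 18.26473, 62.92256, 50.46951, 19.14473, 22.58552, 24.14309]
y = [8.319966, 2.569211, 1.306941, 8.450002, 1.624244, 1.887139, 1.376355,
     2.521150, 5.940253, 1.458392, 3.257468, 1.574528, 2.338976]


def u_statistic(x, y):
    """Number of (xi, yj) pairs with xi > yj; this is R's W when there are no ties."""
    return sum(1 for xi in x for yj in y if xi > yj)


@lru_cache(maxsize=None)
def count_arrangements(m, n, u):
    """Number of orderings of m x-values and n y-values whose statistic equals u."""
    if u < 0 or u > m * n:
        return 0
    if m == 0 or n == 0:
        return 1
    # The largest observation is either an x (it beats all n y-values) or a y.
    return count_arrangements(m - 1, n, u - n) + count_arrangements(m, n - 1, u)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value: twice the tail probability on the observed side."""
    m, n = len(x), len(y)
    u = u_statistic(x, y)
    if u > m * n / 2:
        tail = sum(count_arrangements(m, n, k) for k in range(u, m * n + 1))
    else:
        tail = sum(count_arrangements(m, n, k) for k in range(0, u + 1))
    return min(1.0, 2 * tail / comb(m + n, m))


w = u_statistic(x, y)
p_exact = exact_two_sided_p(x, y)
print(w, p_exact)
```

Here every x is larger than every y, so W takes its maximum value 14 * 13 = 182 and only one arrangement is that extreme. The p-value is therefore 2 / C(27, 13), which is about 9.971e-08. If you would rather stay within SciPy, recent versions of scipy.stats.mannwhitneyu accept method="exact", which should give the same kind of exact p-value.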
Normal approximation with continuity correction:
> wilcox.test(x, y, exact=FALSE, correct=TRUE)
Wilcoxon rank sum test with continuity correction
data: x and y
W = 182, p-value = 1.125e-05
alternative hypothesis: true location shift is not equal to 0
Normal approximation without continuity correction:
> (w0 <- wilcox.test(x, y, exact=FALSE, correct=FALSE))
Wilcoxon rank sum test
data: x and y
W = 182, p-value = 1.006e-05
alternative hypothesis: true location shift is not equal to 0
With a few more digits:
w0$p.value
[1] 1.005997e-05
It looks like the other value Python gives you (4.415880433163923) is the Z-score:
2*pnorm(4.415880433163923,lower.tail=FALSE)
[1] 1.005997e-05
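The same normal approximation can be written out by hand in Python, which makes the role of the continuity correction explicit. This is a sketch under the assumption of no ties; `normal_approx` is a name I made up, and the correction (moving the statistic 0.5 toward its mean) is my understanding of what correct=TRUE does.

```python
from math import erfc, sqrt

x = [57.07168, 46.95301, 31.86423, 38.27486, 77.89309, 76.78879, 33.29809,
     58.61569, 18.26473, 62.92256, 50.46951, 19.14473, 22.58552, 24.14309]
y = [8.319966, 2.569211, 1.306941, 8.450002, 1.624244, 1.887139, 1.376355,
     2.521150, 5.940253, 1.458392, 3.257468, 1.574528, 2.338976]


def normal_approx(x, y, correct):
    """Return (z, two-sided p) for the rank-sum test, assuming no ties."""
    m, n = len(x), len(y)
    u = sum(1 for xi in x for yj in y if xi > yj)
    mean = m * n / 2
    sd = sqrt(m * n * (m + n + 1) / 12)
    diff = u - mean
    if correct and diff != 0:
        # Continuity correction: shrink the deviation by 0.5 toward zero.
        diff -= 0.5 if diff > 0 else -0.5
    z = diff / sd
    # erfc(|z| / sqrt(2)) equals twice the upper tail of the standard normal.
    return z, erfc(abs(z) / sqrt(2))


z_plain, p_plain = normal_approx(x, y, correct=False)
z_corr, p_corr = normal_approx(x, y, correct=True)
print(z_plain, p_plain)
print(z_corr, p_corr)
```

Without the correction this should reproduce the Z-score and p-value that scipy.stats.ranksums returned (about 4.4159 and 1.006e-05). With the correction it should match the 1.125e-05 from the exact=FALSE, correct=TRUE call.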
It's good to want to know what is going on, but I would also point out that there is hardly ever any practical difference between p=1e-7 and p=1e-5...