I just noticed that for a 2 x 2 table whose cells have low frequencies, R
seems to compute the chi^2 statistic incorrectly, even with the Yates correction.
mat <- matrix(c(3, 2, 14, 10), ncol = 2)
chi <- stats::chisq.test(mat)
## Warning message:
## In stats::chisq.test(mat) : Chi-squared approximation may be incorrect
# from the function
chi$statistic
## X-squared
## 1.626059e-31
# as it should be (with Yates correction)
sum((abs(chi$observed - chi$expected) - 0.5)^2 / chi$expected)
## [1] 0.1851001
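Am I right in thinking that R
computes this incorrectly and that the second method, which yields 0.185, is more accurate? Or do the small cell counts mean that all bets are off?
Update
Without the Yates continuity correction, it seems to work fine: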
chi <- stats::chisq.test(mat, correct = FALSE)
## Warning message:
## In stats::chisq.test(mat, correct = FALSE) :
## Chi-squared approximation may be incorrect
chi$statistic
## X-squared
## 0.004738562
sum((abs(chi$observed - chi$expected))^2 / chi$expected)
## [1] 0.004738562
Answer 0 (score: 4)
The help file / man page states
one half is subtracted from all |O - E| differences; however,
the correction will not be bigger than the differences themselves.
The differences in your example are all smaller than 0.5:
> chi$observed - chi$expected
            [,1]        [,2]
[1,]  0.06896552 -0.06896552
[2,] -0.06896552  0.06896552
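So, at the very least, this appears to be documented behavior: the correction is capped at |O - E|, which drives the corrected statistic to (numerically) zero here.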
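To make the capping concrete, here is a minimal sketch that recomputes both versions of the statistic by hand. It assumes the cap is applied as a single `min(0.5, |O - E|)` over the whole table, which is my reading of the help text quoted above; the names `expected`, `yates`, `capped` and `uncapped` are made up for this illustration.

```r
# Same table as in the question.
mat <- matrix(c(3, 2, 14, 10), ncol = 2)

# Expected counts under independence: row total * column total / grand total.
expected <- outer(rowSums(mat), colSums(mat)) / sum(mat)

# Assumed form of the capped correction: never larger than any |O - E|.
yates <- min(0.5, abs(mat - expected))

# Statistic with the capped correction versus a fixed 0.5 correction.
capped   <- sum((abs(mat - expected) - yates)^2 / expected)
uncapped <- sum((abs(mat - expected) - 0.5)^2 / expected)

capped    # effectively zero, up to floating-point rounding
uncapped  # about 0.185, the hand-computed value from the question
```

With a fixed 0.5, each |O - E| of about 0.069 is pushed past zero to about -0.43 before squaring, which is why the hand-computed 0.185 is larger than the uncorrected statistic rather than smaller.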
Side note: if in doubt, you can apparently use a p-value found by simulation
> chi <- stats::chisq.test(mat, simulate.p.value=TRUE, B=1e6)
> chi
Pearson's Chi-squared test with simulated p-value (based on 1e+06 replicates)
data: mat
X-squared = 0.0047386, df = NA, p-value = 1
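which in this case gives a chi-squared value somewhere in between (0.0047, the same as the uncorrected statistic) and gets rid of the warning. Alternatively, use fisher.test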
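For completeness, a short sketch of the fisher.test route on the same table. It also prints the expected counts, since the "approximation may be incorrect" warning is commonly tied to small expected counts; the variable names `expected` and `ft` are illustrative only.

```r
# Same table as in the question.
mat <- matrix(c(3, 2, 14, 10), ncol = 2)

# Expected counts under independence; small values here are what make
# the chi-squared approximation questionable.
expected <- outer(rowSums(mat), colSums(mat)) / sum(mat)
expected

# Fisher's exact test does not rely on the chi-squared approximation.
ft <- stats::fisher.test(mat)
ft$p.value   # exact two-sided p-value
ft$estimate  # conditional maximum likelihood estimate of the odds ratio
```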
...