Question

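I'm looking for a quick way to do the following (without creating too many new data frames):

Imagine I have two variables: data$occupation (rows, top to bottom, "1" to "4") and data$disease (columns, left to right, "yes" and "no"), containing the following data: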

mat1<-matrix(c(54,23,28,45,16,10,17,13), 4,2)

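I would like to end up with a table showing the proportion of "yes" in each occupation category, the difference in that proportion between occupations, and the confidence interval for that difference.

With prop.test(table(data$occupation, data$disease), correct=FALSE) I get the different proportions, but now I would like to find a command that gives the differences between proportions together with the associated CIs (one that I can cite).

Something like twoby2() (which gives the OR and RR) would be nice.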

Answer 1

I'm new to statistics; however, based on this posting, I tried this:

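tab <- table(data$occupation, data$disease)
# Every pair of occupations (rows of tab), one pair per row
combinations <- t(combn(nrow(tab), 2))
cbind(combinations, t(apply(combinations, 1, function(rows) {
  # x: counts in the first column of tab for the two occupations being compared
  # n: the total sample size, so each proportion is relative to all observations
  re <- prop.test(x=tab[rows, 1], n=rep(nrow(data), 2), correct=FALSE)
  re$estimate <- unname(re$estimate)
  return(c(
    propY1 = re$estimate[1],
    propY2 = re$estimate[2],
    diff = re$estimate[1]-re$estimate[2],
    l = re$conf.int[1],  # lower bound of the CI for the difference
    u = re$conf.int[2]   # upper bound of the CI for the difference
  ))
})))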
#             propY1    propY2        diff           l           u
# [1,] 1 2 0.2621359 0.1116505  0.15048544  0.07661756  0.22435331
# [2,] 1 3 0.2621359 0.1359223  0.12621359  0.05007540  0.20235179
# [3,] 1 4 0.2621359 0.2184466  0.04368932 -0.03871570  0.12609434
# [4,] 2 3 0.1116505 0.1359223 -0.02427184 -0.08783067  0.03928698
# [5,] 2 4 0.1116505 0.2184466 -0.10679612 -0.17774178 -0.03585045
# [6,] 3 4 0.1359223 0.2184466 -0.08252427 -0.15583081 -0.00921773
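
Note that the proportions above are counts divided by the total sample size (54/206 and so on). If what is wanted is the proportion of "yes" within each occupation, as the question describes, the denominators would be the row totals instead. Below is a minimal sketch under that assumption. It rebuilds the table from the counts in the question rather than from data, and the dimnames and the names yes, n, pairs and pairwise_diff are only illustrative. With correct=FALSE and two groups, the interval from prop.test should be the usual Wald interval, diff ± z * sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).

```r
# Counts from the question: rows are occupations 1-4, columns are disease yes/no.
# The dimnames are assumed labels; the question only gives the raw counts.
tab <- matrix(c(54, 23, 28, 45, 16, 10, 17, 13), nrow = 4, ncol = 2,
              dimnames = list(occupation = as.character(1:4),
                              disease = c("yes", "no")))

yes <- tab[, "yes"]   # "yes" counts per occupation
n   <- rowSums(tab)   # number of people per occupation (assumed denominator)

# One row per pair of occupations
pairs <- t(combn(nrow(tab), 2))

pairwise_diff <- t(apply(pairs, 1, function(rows) {
  # Two-sample test of proportions for this pair, without continuity correction
  re  <- prop.test(x = yes[rows], n = n[rows], correct = FALSE)
  est <- unname(re$estimate)
  c(occ1  = rows[1], occ2 = rows[2],
    prop1 = est[1],  prop2 = est[2],
    diff  = est[1] - est[2],
    lower = re$conf.int[1],
    upper = re$conf.int[2])
}))

print(round(pairwise_diff, 3))
```

With six pairwise intervals, some adjustment for multiple comparisons may also be worth considering; this sketch does not do that.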

Confidence interval for the difference in proportions in R
