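I am working with small-scale survey data in R.
I would greatly appreciate input on the best/simplest test for showing row-wise significance of the group differences across a series of options (opt1-opt9). When my data is grouped/aggregated, it looks like this (respondents could select multiple options):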
option | group1_count | group1_percent | group2_count | group2_percent | diff_% |
---|---|---|---|---|---|
opt1 | 14 | 0.081395349 | 17 | 0.042821159 | 0.038574 |
opt2 | 23 | 0.13372093 | 59 | 0.14861461 | -0.01489 |
opt3 | 29 | 0.168604651 | 65 | 0.16372796 | 0.004877 |
opt4 | 6 | 0.034883721 | 6 | 0.01511335 | 0.01977 |
opt5 | 2 | 0.011627907 | 7 | 0.017632242 | -0.006 |
opt6 | 38 | 0.220930233 | 88 | 0.221662469 | -0.00073 |
opt7 | 37 | 0.215116279 | 98 | 0.246851385 | -0.03174 |
opt8 | 11 | 0.063953488 | 25 | 0.062972292 | 0.000981 |
opt9 | 12 | 0.069767442 | 32 | 0.080604534 | -0.01084 |
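Is a t-test valid here for showing whether there is a significant difference between group 1 and group 2? If so, is there a simple way to produce this row by row in R? If not, do you have any suggestions?
Here are the first 3 rows as dput: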
structure(list(opt = c("opt1", "opt2", "opt3"), group1_count = c(14,
23, 29), group1_percent = c(0.081395349, 0.13372093, 0.168604651
), group2_count = c(17, 59, 65), group2_percent = c(0.042821159,
0.14861461, 0.16372796), percent_diff = c(0.03857419, -0.01489368,
0.00487669099999999)), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"))
Many thanks.
Answer 0: (score: 2)
If you only want to compare the two groups on the first row, you can perform a two-proportion z-test. For example, in R:
result <- prop.test(x = c(14, 17), n = c(172, 397))
where 172 = sum(group1_count) and 397 = sum(group2_count).
Output:
2-sample test for equality of proportions with continuity correction
data: c(14, 17) out of c(172, 397)
X-squared = 2.758, df = 1, p-value = 0.09677
alternative hypothesis: two.sided
95 percent confidence interval:
-0.01105128 0.08819966
sample estimates:
prop 1 prop 2
0.08139535 0.04282116
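Since the question asks for a result on every row, the same test can be repeated per option. The following is only a sketch under a few assumptions: it follows the call above in using the column totals (172 and 397) as denominators, which are totals of selections rather than numbers of respondents because multiple options could be chosen; the Holm adjustment via `p.adjust` is my addition to account for running nine tests; and the names `g1`, `g2`, `p_raw` and `p_adj` are hypothetical. Rows with very small counts (such as opt4 and opt5) may trigger a warning about the chi-squared approximation, in which case `fisher.test` on the corresponding 2x2 table is a possible alternative.

```r
# Counts from the question's table (opt1..opt9)
g1 <- c(14, 23, 29, 6, 2, 38, 37, 11, 12)
g2 <- c(17, 59, 65, 6, 7, 88, 98, 25, 32)
n1 <- sum(g1)  # total selections in group 1
n2 <- sum(g2)  # total selections in group 2

# One two-proportion test per row, keeping only the p-value
p_raw <- mapply(function(x1, x2) {
  prop.test(x = c(x1, x2), n = c(n1, n2))$p.value
}, g1, g2)

# Holm adjustment for multiple comparisons (an assumption, not part of the answer above)
p_adj <- p.adjust(p_raw, method = "holm")

res <- data.frame(opt = paste0("opt", 1:9),
                  p_raw = round(p_raw, 4),
                  p_adj = round(p_adj, 4))
print(res)
```

The first entry of `p_raw` corresponds to the single test shown above; the adjusted values are never smaller than the raw ones.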
If you want to compare all the proportions at once, you can use a chi-squared test:
data <- as.table(cbind(c(14, 23, 29, 6, 2, 38, 37, 11, 12),
c(17, 59, 65, 6, 7, 88, 98, 25, 32)))
chisq <- chisq.test(data, simulate.p.value = TRUE)
Output:
Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)
data: data
X-squared = 6.671, df = NA, p-value = 0.5787
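The overall test does not say which options differ. One possible follow-up, sketched here as an assumption rather than as part of the original answer, is to look at the standardized residuals that `chisq.test` returns in its `stdres` component; absolute values well above roughly 2 are often read as cells that deviate from independence. The object names `counts` and `fit` and the `set.seed(1)` call (used only to make the simulated p-value reproducible) are hypothetical.

```r
# Same counts as above, with row and column labels for readability
counts <- cbind(group1 = c(14, 23, 29, 6, 2, 38, 37, 11, 12),
                group2 = c(17, 59, 65, 6, 7, 88, 98, 25, 32))
rownames(counts) <- paste0("opt", 1:9)

set.seed(1)  # assumption: fixed seed so the Monte Carlo p-value is repeatable
fit <- chisq.test(counts, simulate.p.value = TRUE)

# Standardized residuals per cell: (observed - expected) / estimated standard error
print(round(fit$stdres, 2))
```

The test statistic is unaffected by the simulation; only the p-value is estimated by resampling.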