I want to run a t.test on q1 in df, comparing groups A and B:
q1 q2 q3 group
1 0 1 A
0 1 0 B
1 1 1 A
0 1 0 B
The script is:
t.test(subset(df,group==A,select = c("q1")),subset(df,group==B,select = c("q1")),alternative = "two.sided")
I wrapped the t.test script in a function:
x<-function(qnum){t.test(subset(df,group==A,select = c("qnum")),subset(df,group==B,select = c("qnum")),alternative = "two.sided")}
Then I thought apply could run it for q1, q2, q3, and so on:
y<-select(df,grep("q\\d",colnames(df),perl=TRUE))
apply(y,2,x)
But I get an error:
Error in `[.data.frame`(x, r, vars, drop = drop) :
How can I automatically get t.test results for multiple columns?
Answer 0 (score: 4)
You can handle this more cleanly with the formula interface of t.test(). For example, t.test(q1 ~ group, data = df).
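As a rough sketch of how this applies to the question's own function: `select = c("qnum")` asks for a column literally named `qnum` rather than the value of the argument, and `apply(y, 2, x)` passes column values rather than column names. One way around both, assuming the data frame is called `df` as in the question, is to loop over the column names and build the formula with `reformulate()`. The helper name `run_ttest` and the toy values below are made up for illustration.

```r
# Toy data standing in for the question's `df` (values are invented)
set.seed(1)
df <- data.frame(
  q1 = rnorm(10),
  q2 = rnorm(10),
  group = rep(c("A", "B"), each = 5)
)

# Hypothetical helper: takes a column *name* and builds e.g. `q1 ~ group`
run_ttest <- function(qnum, data = df) {
  t.test(reformulate("group", response = qnum), data = data,
         alternative = "two.sided")
}

# Pick the question columns by name, then run the helper on each name
qcols <- grep("^q\\d+$", colnames(df), value = TRUE, perl = TRUE)
results <- lapply(setNames(qcols, qcols), run_ttest)

# Each element is an ordinary htest object
results$q1$p.value
```

Passing names instead of values keeps the test output labelled with the real column name (for example `q1 by group`).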
Below I demonstrate the formula interface on simulated data, then use lapply() to run t.test() on every column (except group):
# Create data
set.seed(123) # This makes sampling replicable
d <- data.frame(
q1 = rnorm(20),
q2 = rnorm(20),
q3 = rnorm(20),
group = sample(c("A", "B"), size = 20, replace = TRUE)
)
head(d)
#> q1 q2 q3 group
#> 1 -0.56047565 -1.0678237 -0.6947070 B
#> 2 -0.23017749 -0.2179749 -0.2079173 A
#> 3 1.55870831 -1.0260044 -1.2653964 A
#> 4 0.07050839 -0.7288912 2.1689560 A
#> 5 0.12928774 -0.6250393 1.2079620 A
#> 6 1.71506499 -1.6866933 -1.1231086 B
# Example of using a formula
t.test(d$q1 ~ d$group)
#>
#> Welch Two Sample t-test
#>
#> data: d$q1 by d$group
#> t = -0.76262, df = 17.323, p-value = 0.4559
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.2294678 0.5759458
#> sample estimates:
#> mean in group A mean in group B
#> -0.05443279 0.27232820
# How to apply t.test to every column with lapply()
# - d[,-4] is all data excluding `group` variable
lapply(d[,-4], function(i) t.test(i ~ d$group))
#> $q1
#>
#> Welch Two Sample t-test
#>
#> data: i by d$group
#> t = -0.76262, df = 17.323, p-value = 0.4559
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.2294678 0.5759458
#> sample estimates:
#> mean in group A mean in group B
#> -0.05443279 0.27232820
#>
#>
#> $q2
#>
#> Welch Two Sample t-test
#>
#> data: i by d$group
#> t = -1.6467, df = 17.731, p-value = 0.1172
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.2881952 0.1568201
#> sample estimates:
#> mean in group A mean in group B
#> -0.3906697 0.1750179
#>
#>
#> $q3
#>
#> Welch Two Sample t-test
#>
#> data: i by d$group
#> t = 0.52889, df = 13.016, p-value = 0.6058
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -0.7569843 1.2478547
#> sample estimates:
#> mean in group A mean in group B
#> 0.253746354 0.008311147
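The list returned by lapply() is easy to read for three columns but gets long quickly. A possible follow-up, sketched here under the assumption that only the t statistic, degrees of freedom, and p-value are needed, is to pull those fields out of each htest object into one data frame. The names `tests` and `summary_tbl` are illustrative, and the exact numbers depend on the R version's random number generator, so no output is shown.

```r
# Same simulated data as in the answer
set.seed(123)
d <- data.frame(
  q1 = rnorm(20),
  q2 = rnorm(20),
  q3 = rnorm(20),
  group = sample(c("A", "B"), size = 20, replace = TRUE)
)

# Run the Welch t-test on every column except `group`
tests <- lapply(d[, names(d) != "group"], function(i) t.test(i ~ d$group))

# Extract one number per test from each htest object
summary_tbl <- data.frame(
  variable = names(tests),
  t        = vapply(tests, function(x) unname(x$statistic), numeric(1)),
  df       = vapply(tests, function(x) unname(x$parameter), numeric(1)),
  p.value  = vapply(tests, function(x) x$p.value, numeric(1)),
  row.names = NULL
)
summary_tbl
```

With many columns it may also be worth adjusting for multiple comparisons, for example with p.adjust(summary_tbl$p.value, method = "holm").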