I have a (large) dataset that looks like this:
dat <- data.frame(m=c(rep("a",4),rep("b",3),rep("c",2)),
n1 =round(rnorm(mean = 20,sd = 10,n = 9)))
g <- rnorm(20,10,5)
dat
m n1
1 a 15.132
2 a 17.723
3 a 3.958
4 a 19.239
5 b 11.417
6 b 12.583
7 b 32.946
8 c 11.970
9 c 26.447
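I want to run a t-test against the vector g for each category of "m", like this:
n1.a <- c(15.132, 17.723, 3.958, 19.239)
I need something like t.test(n1.a, g) for every category.
I first thought of using split(dat, dat$m) to break the data into a list and then applying lapply, but it did not work.
Any ideas?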
Answer 0 (score: 2)
Here is a tidyverse solution using map from purrr:
library(tidyverse)
dat %>%
  split(.$m) %>%
  map(~ t.test(.x$n1, g))
Or, as mentioned above, use lapply, which stores all the t-test results in a list:
res <- split(dat, dat$m)
res <- lapply(res, function(x) t.test(x$n1, g))
or the shorter version with by (thanks to @markus):
res <- by(dat, dat$m, function(x) t.test(x$n1, g))
Each of these returns one t-test result per level of m.
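Because every element of the resulting list is an htest object, individual components such as the p-value can be pulled out by name. The following is a minimal sketch, not part of the original answer: the names res, p_values, t_stats and summary_tab are made up for illustration, and the seed is an arbitrary choice to make the example reproducible.

```r
# Same construction as the question's data; the seed is an assumption
set.seed(1)
dat <- data.frame(m = c(rep("a", 4), rep("b", 3), rep("c", 2)),
                  n1 = round(rnorm(mean = 20, sd = 10, n = 9)))
g <- rnorm(20, 10, 5)

# One htest object per level of m
res <- lapply(split(dat, dat$m), function(x) t.test(x$n1, g))

# Pull single components out of each htest object
p_values <- vapply(res, function(x) x$p.value, numeric(1))
t_stats  <- vapply(res, function(x) unname(x$statistic), numeric(1))

# Collect them into one row per group
summary_tab <- data.frame(m = names(res), t = t_stats, p.value = p_values,
                          row.names = NULL)
print(summary_tab)
```

vapply is used instead of sapply so that the call fails loudly if a component is ever not a single number.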
Answer 1 (score: 1)
You can do this in base R:
lapply(split(dat, dat$m), function(x) t.test(x$n1, g))
Output
$a
Welch Two Sample t-test
data: x$n1 and g
t = 1.9586, df = 3.2603, p-value = 0.1377
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6.033451 27.819258
sample estimates:
mean of x mean of y
21.0000 10.1071
$b
Welch Two Sample t-test
data: x$n1 and g
t = 2.3583, df = 2.3202, p-value = 0.1249
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-5.96768 25.75349
sample estimates:
mean of x mean of y
20.0000 10.1071
$c
Welch Two Sample t-test
data: x$n1 and g
t = 13.32, df = 15.64, p-value = 6.006e-10
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
13.77913 19.00667
sample estimates:
mean of x mean of y
26.5000 10.1071
Data
set.seed(1)
dat <- data.frame(m=c(rep("a",4),rep("b",3),rep("c",2)),
n1 =round(rnorm(mean = 20,sd = 10,n = 9)))
g <- rnorm(20,10,5)
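With this data, the per-group result can be checked against the manual call described in the question. This is only a sanity-check sketch: manual_a, same_t and same_p are illustrative names, and the comparison covers group "a" alone.

```r
set.seed(1)
dat <- data.frame(m = c(rep("a", 4), rep("b", 3), rep("c", 2)),
                  n1 = round(rnorm(mean = 20, sd = 10, n = 9)))
g <- rnorm(20, 10, 5)

# Per-group tests via split + lapply
res <- lapply(split(dat, dat$m), function(x) t.test(x$n1, g))

# The manual call from the question, for group "a" only
n1.a <- dat$n1[dat$m == "a"]
manual_a <- t.test(n1.a, g)

# Compare the test statistic and p-value of the two approaches
same_t <- isTRUE(all.equal(unname(manual_a$statistic), unname(res$a$statistic)))
same_p <- isTRUE(all.equal(manual_a$p.value, res$a$p.value))
print(c(same_t = same_t, same_p = same_p))
```

If both comparisons agree, the list element for "a" is the same test as t.test(n1.a, g), just computed for every level of m at once.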