Mean between groups in R

Time: 2018-09-25 23:29:18

Tags: r

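I have been trying to get the mean and standard deviation between groups in a data frame, without success.

It is easier to explain with an example.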

sample <- c("CT", "CT", "CT", "CT", "CT", "CT", "X1", "X1", "X1", "X1", "X1", "X1") 
test <- c("AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS") 
replicate <- c("a", "a", "a", "a", "b", "b", "a", "a", "a", "a", "b", "b")
xvalue <- c(1,1,2,2,1,1,1,1,2,2,1,1)
moduli<- c("G1", "G2", "G1", "G2", "G1", "G2", "G1", "G2", "G1", "G2", "G1", "G2" ) 
yvalue <- c(12, 15, 34, 23, 23, 23, 54, 23, 24, 21, 12, 11)

df <- data.frame(sample, test, replicate, moduli, xvalue, yvalue)


obs. sample test replicate moduli xvalue yvalue
1      CT   AS         a     G1      1     12
2      CT   AS         a     G2      1     15
3      CT   AS         a     G1      2     34
4      CT   AS         a     G2      2     23
5      CT   AS         b     G1      1     23
6      CT   AS         b     G2      1     23
7      X1   AS         a     G1      1     54
8      X1   AS         a     G2      1     23
9      X1   AS         a     G1      2     24
10     X1   AS         a     G2      2     21
11     X1   AS         b     G1      1     12
12     X1   AS         b     G2      1     11

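What I need to do is group by sample, test and moduli, and get the mean and standard deviation of yvalue across the replicates. In this example, that would be the mean and sd between obs. 1 and 5, 2 and 6, 7 and 11, and 8 and 12.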

I guess this can be done with aggregate or dplyr, but so far I have not managed it.

Thanks!

1 answer:

Answer 0: (score: 0)

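If I understand you correctly, you want the average of yvalue from observations 1 and 5, because they share the same values for the grouping variables you mentioned and for xvalue, and likewise for observations 2 and 6, and so on. If that is the case, you also need to include xvalue as a grouping variable:

library(dplyr)

df %>%
  group_by(sample, test, moduli, xvalue) %>%
  summarise(mean.y = mean(yvalue), sd.y = sd(yvalue))

# A tibble: 8 x 6
# Groups:   sample, test, moduli [?]
  sample test  moduli xvalue mean.y   sd.y
  <fct>  <fct> <fct>   <dbl>  <dbl>  <dbl>
1 CT     AS    G1          1   17.5   7.78
2 CT     AS    G1          2   34   NaN
3 CT     AS    G2          1   19     5.66
4 CT     AS    G2          2   23   NaN
5 X1     AS    G1          1   33    29.7
6 X1     AS    G1          2   24   NaN
7 X1     AS    G2          1   17     8.49
8 X1     AS    G2          2   21   NaN

This calculates the mean across the different replicates in each group. The SD, however, cannot be calculated unless the group contains more than one observation.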
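The question also mentions aggregate, so here is a minimal base R sketch of the same grouping, offered as one possible approach rather than the answerer's own. It rebuilds the example data with rep() (same values as in the question) and adds a per-group count so it is easy to see which groups hold only one observation. The names res and n are my own choices for illustration and do not come from the original post.

```r
# Rebuild the example data from the question (same values, written with rep()).
df <- data.frame(
  sample    = rep(c("CT", "X1"), each = 6),
  test      = "AS",
  replicate = rep(c("a", "a", "a", "a", "b", "b"), times = 2),
  moduli    = rep(c("G1", "G2"), times = 6),
  xvalue    = rep(c(1, 1, 2, 2, 1, 1), times = 2),
  yvalue    = c(12, 15, 34, 23, 23, 23, 54, 23, 24, 21, 12, 11)
)

# Aggregate yvalue over the same four grouping variables.
# FUN returns a named vector, so `yvalue` comes back as a matrix column
# holding the count, mean and sd of each group.
res <- aggregate(
  yvalue ~ sample + test + moduli + xvalue,
  data = df,
  FUN  = function(y) c(n = length(y), mean = mean(y), sd = sd(y))
)

# Flatten the matrix column into ordinary columns:
# yvalue.n, yvalue.mean, yvalue.sd.
res <- do.call(data.frame, res)
print(res)
```

Groups with yvalue.n equal to 1 (here, every group with xvalue 2) have a missing sd, because base R's sd() returns NA for a single value. If those groups should be excluded, filtering on yvalue.n > 1 afterwards is one option.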