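I have been trying to get the mean and standard deviation across groups in a data frame, without success.
It is easier to explain with an example.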
sample <- c("CT", "CT", "CT", "CT", "CT", "CT", "X1", "X1", "X1", "X1", "X1", "X1")
test <- c("AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS", "AS")
replicate <- c("a", "a", "a", "a", "b", "b", "a", "a", "a", "a", "b", "b")
xvalue <- c(1,1,2,2,1,1,1,1,2,2,1,1)
moduli <- c("G1", "G2", "G1", "G2", "G1", "G2", "G1", "G2", "G1", "G2", "G1", "G2")
yvalue <- c(12, 15, 34, 23, 23, 23, 54, 23, 24, 21, 12, 11)
df <- data.frame(sample, test, replicate, moduli, xvalue, yvalue)
obs.  sample  test  replicate  moduli  xvalue  yvalue
1     CT      AS    a          G1      1       12
2     CT      AS    a          G2      1       15
3     CT      AS    a          G1      2       34
4     CT      AS    a          G2      2       23
5     CT      AS    b          G1      1       23
6     CT      AS    b          G2      1       23
7     X1      AS    a          G1      1       54
8     X1      AS    a          G2      1       23
9     X1      AS    a          G1      2       24
10    X1      AS    a          G2      2       21
11    X1      AS    b          G1      1       12
12    X1      AS    b          G2      1       11
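What I need to do is group by sample, test and moduli, and get the mean and standard deviation of yvalue across replicate. So in this example, that would be the mean and SD between obs. 1 and 5, 2 and 6, 7 and 11, and 8 and 12.
I guess this can be done with aggregate or dplyr, but I have had no success so far.
Thanks!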
Answer 0 (score: 0)
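If I understand you correctly, you want the mean of yvalue over observations 1 and 5 because they share the same values of the grouping variables you mentioned and of xvalue, and likewise for observations 2 and 6, and so on. If that is the case, you also need to add xvalue as a grouping variable:
library(dplyr)
# Group by sample, test, moduli and xvalue, then summarise yvalue per group
df %>%
  group_by(sample, test, moduli, xvalue) %>%
  summarise(mean.y = mean(yvalue),
            sd.y = sd(yvalue))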
# A tibble: 8 x 6
# Groups: sample, test, moduli [?]
sample test moduli xvalue mean.y sd.y
<fct> <fct> <fct> <dbl> <dbl> <dbl>
1 CT AS G1 1 17.5 7.78
2 CT AS G1 2 34 NaN
3 CT AS G2 1 19 5.66
4 CT AS G2 2 23 NaN
5 X1 AS G1 1 33 29.7
6 X1 AS G1 2 24 NaN
7 X1 AS G2 1 17 8.49
8 X1 AS G2 2 21 NaN
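This calculates the mean across the different replicate values within each group. However, the SD cannot be calculated unless the group contains more than one observation, which is why the groups with xvalue 2 have no SD.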
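Since the question also mentions aggregate, here is a minimal base R sketch of the same grouping that additionally reports the group size, which makes it easier to see which groups have only one observation. It rebuilds the example data so it runs on its own; the statistic names n, mean and sd are illustrative choices, not anything from the original post.

```r
# Rebuild the example data from the question so the snippet is self-contained.
df <- data.frame(
  sample    = rep(c("CT", "X1"), each = 6),
  test      = "AS",
  replicate = rep(c("a", "a", "a", "a", "b", "b"), times = 2),
  moduli    = rep(c("G1", "G2"), times = 6),
  xvalue    = rep(c(1, 1, 2, 2, 1, 1), times = 2),
  yvalue    = c(12, 15, 34, 23, 23, 23, 54, 23, 24, 21, 12, 11)
)

# Group by sample, test, moduli and xvalue; for each group return the
# number of observations, the mean and the SD of yvalue.
res <- aggregate(
  yvalue ~ sample + test + moduli + xvalue,
  data = df,
  FUN  = function(v) c(n = length(v), mean = mean(v), sd = sd(v))
)

# aggregate() stores the three statistics as one matrix column;
# flatten it into ordinary columns (yvalue.n, yvalue.mean, yvalue.sd).
res <- do.call(data.frame, res)
res
```

Groups where yvalue.n is 1 get a missing SD, because base R's sd() returns NA for a single value. If an SD is needed for every group, each group would have to contain at least two replicates.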