Suppose I have a data frame named "Data" that looks like this:
View(Data)
Ball Day Expansion
Red 1 5
Red 1 8
Red 1 3
Red 2 7
Red 2 9
Blue 1 5
Blue 1 3
Blue 2 7
Blue 2 5
Blue 2 4
...
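From this data I want the mean, the standard deviation (SD), and the standard error of the mean (SE) for each Ball and Day group, so that the final result looks like this:
# Note: the 'Expansion' value shown is the group mean; 'X' and 'Y' stand for the computed SE and SD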
Ball Day Expansion SE SD
Red 1 7 X Y
Red 2 5 X Y
Red 3 6 X Y
Red 4 5 X Y
Blue 1 4 X Y
Blue 2 8 X Y
Blue 3 6 X Y
...
Does anyone know how to do this?
Answer 0: (score: 5)
I hope this is what you have in mind:
library(dplyr)
df %>%
  group_by(Ball, Day) %>%
  summarise(across(Expansion,
                   list(Mean = mean,
                        SD = sd,
                        SE = function(x) sqrt(var(x) / length(x))),
                   .names = "{.fn}.{.col}"))
# A tibble: 4 x 5
# Groups: Ball [2]
Ball Day Mean.Expansion SD.Expansion SE.Expansion
<chr> <dbl> <dbl> <dbl> <dbl>
1 Blue 1 4 1.41 1
2 Blue 2 5.33 1.53 0.882
3 Red 1 5.33 2.52 1.45
4 Red 2 8 1.41 1
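As dear @www suggested, the output of summarise is more concise. However, using mutate in place of summarise keeps every original row and adds the statistics as new columns, which is closer to the layout in your question: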
# A tibble: 10 x 6
# Groups: Ball, Day [4]
Ball Day Expansion Mean.Expansion SD.Expansion SE.Expansion
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Red 1 5 5.33 2.52 1.45
2 Red 1 8 5.33 2.52 1.45
3 Red 1 3 5.33 2.52 1.45
4 Red 2 7 8 1.41 1
5 Red 2 9 8 1.41 1
6 Blue 1 5 4 1.41 1
7 Blue 1 3 4 1.41 1
8 Blue 2 7 5.33 1.53 0.882
9 Blue 2 5 5.33 1.53 0.882
10 Blue 2 4 5.33 1.53 0.882
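The code for the mutate variant is not shown above. A minimal sketch of it, assuming the same `df` as in this answer and swapping `summarise` for `mutate`, might look like the following. The names `se` and `df_stats` are introduced here for illustration only and are not part of the original answer.

```r
library(dplyr)

# Same sample data as used in this answer
df <- tribble(
  ~Ball,  ~Day, ~Expansion,
  "Red",  1,    5,
  "Red",  1,    8,
  "Red",  1,    3,
  "Red",  2,    7,
  "Red",  2,    9,
  "Blue", 1,    5,
  "Blue", 1,    3,
  "Blue", 2,    7,
  "Blue", 2,    5,
  "Blue", 2,    4
)

# Hypothetical helper: standard error of the mean, sd / sqrt(n)
se <- function(x) sqrt(var(x) / length(x))

# mutate() keeps all 10 rows and appends the per-group statistics
df_stats <- df %>%
  group_by(Ball, Day) %>%
  mutate(across(Expansion,
                list(Mean = mean, SD = sd, SE = se),
                .names = "{.fn}.{.col}"))

df_stats
```

Because the result is still grouped by Ball and Day, adding `ungroup()` at the end of the pipe is usually a good idea before doing further work with it.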
Data:
df <- tribble(
  ~Ball,  ~Day, ~Expansion,
  "Red",  1,    5,
  "Red",  1,    8,
  "Red",  1,    3,
  "Red",  2,    7,
  "Red",  2,    9,
  "Blue", 1,    5,
  "Blue", 1,    3,
  "Blue", 2,    7,
  "Blue", 2,    5,
  "Blue", 2,    4
)
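Answer 1: (score: 3)
Here is one approach. We can use the dplyr package for this kind of calculation: group by Ball and Day, then compute the mean, SE, and SD of Expansion within each group.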
library(dplyr)
Data2 <- Data %>%
  group_by(Ball, Day) %>%
  summarize(Mean = mean(Expansion),
            SE = sd(Expansion) / sqrt(n()),
            SD = sd(Expansion)) %>%
  rename(Expansion = Mean) %>%
  ungroup()
Data2
# # A tibble: 4 x 5
# Ball Day Expansion SE SD
# <chr> <int> <dbl> <dbl> <dbl>
# 1 Blue 1 4 1 1.41
# 2 Blue 2 5.33 0.882 1.53
# 3 Red 1 5.33 1.45 2.52
# 4 Red 2 8 1 1.41
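The SE here is written as `sd(Expansion)/sqrt(n())`, while the other answer uses `sqrt(var(x)/length(x))`. Both are the same quantity, since the standard deviation is the square root of the variance. A quick check in base R on the Red, Day 1 values (5, 8, 3) from the sample data; the variable names `x`, `se_a`, and `se_b` are made up for this illustration:

```r
# Expansion values for Ball == "Red", Day == 1 in the sample data
x <- c(5, 8, 3)

# Form used in the other answer
se_a <- sqrt(var(x) / length(x))

# Form used in this answer (length(x) plays the role of n())
se_b <- sd(x) / sqrt(length(x))

all.equal(se_a, se_b)  # TRUE
round(se_a, 2)         # 1.45, matching the Red / Day 1 row above
```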
Data
Data <- read.table(
text = "Ball Day Expansion
Red 1 5
Red 1 8
Red 1 3
Red 2 7
Red 2 9
Blue 1 5
Blue 1 3
Blue 2 7
Blue 2 5
Blue 2 4", header = TRUE
)