I have this data frame:
┌────────┬──────────┬─────────┐
│ s_test │ s_method │ s_error │
├────────┼──────────┼─────────┤
│ n32    │ A        │ 0.12    │
│ n32    │ B        │ 0.54    │
│ n32    │ A        │ 0.48    │
│ n32    │ B        │ 0.25    │
│ n33    │ A        │ 0.52    │
│ n33    │ B        │ 0.42    │
│ n33    │ A        │ 0.13    │
│ n33    │ B        │ 0.18    │
└────────┴──────────┴─────────┘
sdata <- structure(list(s_test = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), .Label = c("n32", "n33"), class = "factor"), s_method = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("A", "B"), class = "factor"),
s_error = c(0.12, 0.54, 0.48, 0.25, 0.52, 0.42, 0.13, 0.18
)), class = "data.frame", row.names = c(NA, -8L))
I want to group by s_test and get one additional column per method, e.g. mean_error_A and mean_error_B.
So far I have tried:
test_sum <- group_by(sdata, s_test) %>%
  summarise(
    mean = mean(s_error, na.rm = TRUE)
  )
I get:
┌────────┬────────┐
│ s_test │ mean   │
├────────┼────────┤
│ n32    │ 0.3475 │
│ n33    │ 0.3125 │
└────────┴────────┘
I want this:
┌────────┬────────┬────────┐
│ s_test │ mean_A │ mean_B │
├────────┼────────┼────────┤
│ n32    │ 0.3000 │ 0.3950 │
│ n33    │ 0.3250 │ 0.3000 │
└────────┴────────┴────────┘
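Answer 0 (score: 4)
We need to group by both 's_test' and 's_method', compute the mean within each group, and then use spread to reshape the result from 'long' to 'wide' format, so that each method gets its own column.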
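library(dplyr)
library(tidyr)
library(stringr)  # provides str_c()
sdata %>%
  # prefix the method labels so the spread columns are named mean_A, mean_B
  group_by(s_test, s_method = str_c('mean_', s_method)) %>%
  summarise(mean = mean(s_error)) %>%
  spread(s_method, mean)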
# A tibble: 2 x 3
# Groups: s_test [2]
# s_test mean_A mean_B
# <fct> <dbl> <dbl>
#1 n32 0.3 0.395
#2 n33 0.325 0.3
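In tidyr 1.0.0 and later, spread is superseded by pivot_wider, so the same grouping-then-reshaping idea can also be written as below. This is only a sketch: it assumes dplyr >= 1.0.0 (for the .groups argument) and rebuilds sdata with data.frame() so the snippet runs on its own; the name wide is just illustrative.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question so the snippet is self-contained
sdata <- data.frame(
  s_test   = factor(c("n32", "n32", "n32", "n32", "n33", "n33", "n33", "n33")),
  s_method = factor(c("A", "B", "A", "B", "A", "B", "A", "B")),
  s_error  = c(0.12, 0.54, 0.48, 0.25, 0.52, 0.42, 0.13, 0.18)
)

wide <- sdata %>%
  group_by(s_test, s_method) %>%
  # one mean per (s_test, s_method) pair; drop the grouping afterwards
  summarise(mean = mean(s_error), .groups = "drop") %>%
  # one column per method, with names prefixed by "mean_"
  pivot_wider(names_from = s_method, values_from = mean, names_prefix = "mean_")

print(wide)
```

Using names_prefix here avoids having to paste 'mean_' onto s_method inside group_by, so stringr is not needed.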
Answer 1 (score: 2)
# sdata is the data frame from the question
reshape2::dcast(data = sdata,
                formula = s_test ~ s_method,
                value.var = "s_error",
                fun.aggregate = mean)
# s_test A B
#1 n32 0.300 0.395
#2 n33 0.325 0.300
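If you would rather not load any package, the same cross-tabulated means can probably be had from base R with tapply. This is a rough sketch, not part of the answer above; the names means and wide are made up for illustration, and the data is rebuilt so the snippet runs standalone.

```r
# Rebuild the example data from the question so the snippet is self-contained
sdata <- data.frame(
  s_test   = factor(c("n32", "n32", "n32", "n32", "n33", "n33", "n33", "n33")),
  s_method = factor(c("A", "B", "A", "B", "A", "B", "A", "B")),
  s_error  = c(0.12, 0.54, 0.48, 0.25, 0.52, 0.42, 0.13, 0.18)
)

# tapply with two grouping factors returns a matrix:
# rows are s_test levels, columns are s_method levels
means <- tapply(sdata$s_error, list(sdata$s_test, sdata$s_method), mean)

# Turn the matrix into a data frame with the desired column names
wide <- data.frame(
  s_test = rownames(means),
  mean_A = unname(means[, "A"]),
  mean_B = unname(means[, "B"]),
  row.names = NULL
)

print(wide)
```

Note that the column names are hard-coded for methods A and B here; with more methods, the dcast call above generalizes more easily.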