Group by a data frame column and spread rows into columns

Date: 2019-09-03 18:23:57

Tags: r dataframe

I have this data frame:

┌────────┬──────────┬─────────┐
│ s_test │ s_method │ s_error │
├────────┼──────────┼─────────┤
│ n32    │ A        │    0.12 │
│ n32    │ B        │    0.54 │
│ n32    │ A        │    0.48 │
│ n32    │ B        │    0.25 │
│ n33    │ A        │    0.52 │
│ n33    │ B        │    0.42 │
│ n33    │ A        │    0.13 │
│ n33    │ B        │    0.18 │
└────────┴──────────┴─────────┘
sdata <- structure(list(s_test = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L), .Label = c("n32", "n33"), class = "factor"), s_method = structure(c(1L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("A", "B"), class = "factor"), 
    s_error = c(0.12, 0.54, 0.48, 0.25, 0.52, 0.42, 0.13, 0.18
    )), class = "data.frame", row.names = c(NA, -8L))

I want to group by s_test and get additional columns, one per method, such as mean_error_A and mean_error_B.

So far I have tried:

test_sum <- group_by(sdata, s_test) %>%
  summarise(
    mean = mean(s_error, na.rm = TRUE),
  )

I get:

┌────────┬────────┐
│ s_test │  mean  │
├────────┼────────┤
│ n32    │ 0.3475 │
│ n33    │ 0.3125 │
└────────┴────────┘

I want this:

┌────────┬────────┬────────┐
│ s_test │ mean_A │ mean_B │
├────────┼────────┼────────┤
│ n32    │ 0.3000 │ 0.3950 │
│ n33    │ 0.3250 │ 0.3000 │
└────────┴────────┴────────┘

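2 answers:

Answer 0: (score: 4)

We need to group by both 's_test' and 's_method', compute the mean per group, and then use spread to reshape the result from "long" to "wide" format.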

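library(dplyr)
library(tidyr)
library(stringr)  # provides str_c()
sdata %>%
     # prefix the method with 'mean_' so the spread columns are named mean_A, mean_B
     group_by(s_test, s_method = str_c('mean_', s_method)) %>%
     summarise(mean = mean(s_error)) %>%
     spread(s_method, mean)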
# A tibble: 2 x 3
# Groups:   s_test [2]
#  s_test mean_A mean_B
#  <fct>   <dbl>  <dbl>
#1 n32     0.3    0.395
#2 n33     0.325  0.3  

Answer 1: (score: 2)

reshape2::dcast(data = sdata,
                formula = s_test ~ s_method,
                value.var = "s_error",
                fun.aggregate = mean)
#  s_test     A     B
#1    n32 0.300 0.395
#2    n33 0.325 0.300
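
If no extra packages are available, a similar table can probably be built in base R with tapply, which computes the mean for every s_test / s_method combination and returns a matrix. The following is a rough sketch rather than part of the original answer; the names means and wide are made up for the example, and sdata is rebuilt here with character columns.

```r
# Rebuild the question's data (character columns instead of factors; an assumption)
sdata <- data.frame(
  s_test   = rep(c("n32", "n33"), each = 4),
  s_method = rep(c("A", "B"), times = 4),
  s_error  = c(0.12, 0.54, 0.48, 0.25, 0.52, 0.42, 0.13, 0.18)
)

# Matrix of means: rows are s_test values, columns are s_method values
means <- tapply(sdata$s_error, list(sdata$s_test, sdata$s_method), mean)

# Turn the matrix into a data frame with the column names asked for in the question
wide <- data.frame(
  s_test = rownames(means),
  mean_A = means[, "A"],
  mean_B = means[, "B"],
  row.names = NULL
)

wide
```

Unlike dcast, this hard-codes the method names "A" and "B", so it would need adjusting if more methods were present.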