When you group_by multiple variables, dplyr helpfully finds the intersection of those groups.
For example,
mtcars %>%
group_by(cyl, am) %>%
summarise(mean(disp))
yields
Source: local data frame [6 x 3]
Groups: cyl [?]
cyl am `mean(disp)`
<dbl> <dbl> <dbl>
1 4 0 135.8667
2 4 1 93.6125
3 6 0 204.5500
4 6 1 155.0000
5 8 0 357.6167
6 8 1 326.0000
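My question is: is there a way to supply multiple variables but summarise marginally, one variable at a time? I would like the output to look as if you had done it by hand, variable by variable.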
df_1 <-
mtcars %>%
group_by(cyl) %>%
summarise(est = mean(disp)) %>%
transmute(group = paste0("cyl_", cyl), est)
df_2 <-
mtcars %>%
group_by(am) %>%
summarise(est = mean(disp)) %>%
transmute(group = paste0("am_", am), est)
bind_rows(df_1, df_2)
The code above produces
# A tibble: 5 × 2
group est
<chr> <dbl>
1 cyl_4 105.1364
2 cyl_6 183.3143
3 cyl_8 353.1000
4 am_0 290.3789
5 am_1 143.5308
Ideally, the syntax would be something like
mtcars %>%
group_by(cyl, am, intersection = FALSE) %>%
summarise(est = mean(disp))
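Does something like this exist in the tidyverse?
(PS: I know the group variable in the table above is not really tidy, because it holds two variables, but I promise my purpose is tidy, okay? :))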
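Answer 0 (score: 4)
I guess what you are looking for is the tidyr package. gather first replicates the dataset so that each grouping factor gets its own copy of the n rows; mutate then creates the grouping variable.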
library(dplyr)
library(tidyr)
mtcars %>%
gather(col, value, cyl, am) %>%
mutate(group = paste(col, value, sep = "_")) %>%
group_by(group) %>%
summarise(est = mean(disp))
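For newer tidyr versions, here is a possible sketch of the same idea using pivot_longer, since the tidyr documentation marks gather as superseded. It assumes tidyr 1.0.0 or later. The column names col and value are carried over from the gather call above, and the result name marginal is only an illustrative choice.

```r
library(dplyr)
library(tidyr)

# Stack cyl and am into a long (col, value) pair, so every original row
# appears once per grouping variable.
marginal <- mtcars %>%
  pivot_longer(cols = c(cyl, am), names_to = "col", values_to = "value") %>%
  # Build the combined label, e.g. "cyl_4" or "am_0".
  mutate(group = paste(col, value, sep = "_")) %>%
  group_by(group) %>%
  summarise(est = mean(disp))

marginal
```

Because the rows are grouped by the combined label, they come back sorted by that label rather than in the order the variables were supplied.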
Answer 1 (score: 1)
A purrr alternative:
library(tidyverse)
map(c('cyl', 'am'),
~ mtcars %>%
group_by_(.x) %>%
summarise(est = mean(disp)) %>%
transmute_(group = lazyeval::interp(~paste0(.x, '_', y), y = as.name(.x)),
~est)) %>%
bind_rows()
# A tibble: 5 × 2
group est
<chr> <dbl>
1 cyl_4 105.1364
2 cyl_6 183.3143
3 cyl_8 353.1000
4 am_0 290.3789
5 am_1 143.5308
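The underscored verbs (group_by_, transmute_) and lazyeval are deprecated in current dplyr, so here is a rough sketch of how the same purrr approach might look with the .data pronoun instead. The helper name marginal_means is hypothetical, and the summarised column disp is hard-coded as in the question.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: summarise mean(disp) separately for each variable
# named in `vars`, then stack the per-variable results.
marginal_means <- function(data, vars) {
  map(vars, function(v) {
    data %>%
      # Group directly on the label, e.g. "cyl_4"; .data[[v]] looks up
      # the column whose name is stored in the string v.
      group_by(group = paste0(v, "_", .data[[v]])) %>%
      summarise(est = mean(disp))
  }) %>%
    bind_rows()
}

res <- marginal_means(mtcars, c("cyl", "am"))
res
```

Each variable is summarised on its own, so the rows keep the order in which the variables were supplied (cyl first, then am).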