Question

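I am trying to group_by multiple columns in a data frame, and I can't write out every column name inside the group_by call, so I want to pass the column names as a vector, like this: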

cols <- colnames(mtcars)[grep("[a-z]{3,}$", colnames(mtcars))]
mtcars %>% filter(disp < 160) %>% group_by(cols) %>% summarise(n = n())

This returns an error:

Error in mutate_impl(.data, dots) : 
  Column `mtcars[colnames(mtcars)[grep("[a-z]{3,}$", colnames(mtcars))]]` must be length 12 (the number of rows) or one, not 7

I definitely want to do this with dplyr functions, but I can't figure out how.

Answer 1

You can use group_by_at, which accepts a character vector of column names as the grouping variables:

mtcars %>% 
    filter(disp < 160) %>% 
    group_by_at(cols) %>% 
    summarise(n = n())
# A tibble: 12 x 8
# Groups:   mpg, cyl, disp, drat, qsec, gear [?]
#     mpg   cyl  disp  drat  qsec  gear  carb     n
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
# 1  19.7     6 145.0  3.62 15.50     5     6     1
# 2  21.4     4 121.0  4.11 18.60     4     2     1
# 3  21.5     4 120.1  3.70 20.01     3     1     1
# 4  22.8     4 108.0  3.85 18.61     4     1     1
# ...

Alternatively, you can move the column selection into group_by_at itself by wrapping a column selection helper in vars:

mtcars %>% 
    filter(disp < 160) %>% 
    group_by_at(vars(matches('[a-z]{3,}$'))) %>% 
    summarise(n = n())

# A tibble: 12 x 8
# Groups:   mpg, cyl, disp, drat, qsec, gear [?]
#     mpg   cyl  disp  drat  qsec  gear  carb     n
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
# 1  19.7     6 145.0  3.62 15.50     5     6     1
# 2  21.4     4 121.0  4.11 18.60     4     2     1
# 3  21.5     4 120.1  3.70 20.01     3     1     1
# 4  22.8     4 108.0  3.85 18.61     4     1     1
# ...
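
As a quick sanity check, the two group_by_at forms above can be compared directly. This is only a sketch: the object names by_vector and by_helper are made up for illustration, and it assumes a dplyr version in which group_by_at and vars are still available.

```r
library(dplyr)

cols <- colnames(mtcars)[grep("[a-z]{3,}$", colnames(mtcars))]

# Grouping by a character vector of column names
by_vector <- mtcars %>%
  filter(disp < 160) %>%
  group_by_at(cols)

# Grouping by a selection helper wrapped in vars()
by_helper <- mtcars %>%
  filter(disp < 160) %>%
  group_by_at(vars(matches("[a-z]{3,}$")))

# Inspect the grouping columns and confirm both forms agree
group_vars(by_vector)
identical(group_vars(by_vector), group_vars(by_helper))
```

Note that matches() ignores case by default, so the two forms could select different columns on a data frame with mixed-case names; on mtcars all names are lowercase, so they coincide.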

Answer 2

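I believe group_by_at has been superseded by a combination of group_by and across. summarise also has an experimental .groups argument that lets you choose how the grouping is handled once the summarised object has been created. Here is an alternative to consider: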

cols <- colnames(mtcars)[grep("[a-z]{3,}$", colnames(mtcars))]

original <- mtcars %>% 
  filter(disp < 160) %>% 
  group_by_at(cols) %>% 
  summarise(n = n())

superseded <- mtcars %>%
  filter(disp < 160) %>%
  group_by(across(all_of(cols))) %>%
  summarise(n = n(), .groups = 'drop_last')

all.equal(original, superseded)

Here is a blog post that goes into detail on using the across function: https://www.tidyverse.org/blog/2020/04/dplyr-1-0-0-colwise/
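
To make the effect of .groups a little more concrete, here is a minimal sketch comparing two of its values. It assumes dplyr 1.0.0 or later (needed for across), and the names grouped, kept and dropped are illustrative only.

```r
library(dplyr)

cols <- colnames(mtcars)[grep("[a-z]{3,}$", colnames(mtcars))]

grouped <- mtcars %>%
  filter(disp < 160) %>%
  group_by(across(all_of(cols)))

# "drop_last" removes only the last grouping column (carb here),
# so the result stays grouped by the remaining columns
kept <- summarise(grouped, n = n(), .groups = "drop_last")
group_vars(kept)

# "drop" removes all grouping and returns an ungrouped tibble
dropped <- summarise(grouped, n = n(), .groups = "drop")
group_vars(dropped)
```

If the result is only going to be printed or joined, "drop" is often the less surprising choice, since later verbs will not silently operate per group.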

dplyr group by colnames described as a vector of strings
