I want to group by a column whose name is passed in dynamically as a string.
df:
col1
a
b
c
d
a
c
d
b
a
b
d
I wrote the following function:
fun1 <- function(df,column_name){
  col_name1 = noquote(column_name)
  out_df = df %>% group_by(col_name1)%>%dplyr::summarise('Count'=n())
  return(out_df)
}
where column_name is a string, for example column_name = 'col1'.
When I call the function, it gives the following error:
Error: Must group by variables found in `.data`.
* Column `col_name1` is not found.
I get this error even though the column exists. Where am I going wrong?
Answer 0 (score: 1)
library(dplyr)
fun1 <- function(df, column_name){
  # Turn the string into a symbol, then unquote it with !! inside group_by()
  col_name1 <- sym(column_name)
  out_df <- df %>%
    group_by(!!col_name1) %>%
    summarise('Count' = n())
  return(out_df)
}
fun1(iris, "Species")
# A tibble: 3 x 2
Species Count
<fct> <int>
1 setosa 50
2 versicolor 50
3 virginica 50
This should also work, with the advantage that it accepts several column names at once:
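fun1 <- function(df, column_name){
  df %>%
    # all_of() is used here in place of the superseded one_of();
    # it raises an error if any requested column is missing
    group_by(across(all_of(column_name))) %>%
    summarise('Count' = n())
}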
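As a rough illustration of the multi-column case, the sketch below passes a character vector of two column names. The helper name fun_multi and the choice of the built-in mtcars data set are assumptions made for this example, not part of the original answer. It selects columns with all_of(), and .groups = "drop" is added only so the result comes back ungrouped.

```r
library(dplyr)

# Hypothetical helper: groups by every column named in `column_names`
# (a character vector) and counts the rows in each group.
fun_multi <- function(df, column_names) {
  df %>%
    group_by(across(all_of(column_names))) %>%
    summarise(Count = n(), .groups = "drop")
}

# Group the built-in mtcars data by two columns supplied as strings.
res <- fun_multi(mtcars, c("cyl", "gear"))
print(res)
```

The same function still works with a single string, such as fun_multi(mtcars, "cyl"), so one definition covers both cases.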
Answer 1 (score: 0)
You can use the .data pronoun:
fun1 <- function(df, column_name){
  # .data[[column_name]] looks the column up by its string name
  out_df <- df %>%
    group_by(.data[[column_name]]) %>%
    summarise(Count = n())
  return(out_df)
}
fun1(df, 'col1')
# col1 Count
# <chr> <int>
#1 a 3
#2 b 3
#3 c 2
#4 d 3
This can also be written with count, which works the same way:
fun2 <- function(df, column_name){
  df %>% count(.data[[column_name]], name = 'Count')
}
fun2(df, 'col1')
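To make the comparison reproducible, here is a minimal sketch that rebuilds the data frame from the question and checks that the count() version and a sym()/!! version agree. The helper names count_by and group_by_sym are hypothetical, chosen for this example so that they do not overwrite fun1 or fun2.

```r
library(dplyr)

# The data frame from the question, rebuilt so the example is self-contained.
df <- data.frame(
  col1 = c("a", "b", "c", "d", "a", "c", "d", "b", "a", "b", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: .data pronoun with count().
count_by <- function(df, column_name) {
  df %>% count(.data[[column_name]], name = "Count")
}

# Hypothetical helper: sym() + !! with group_by()/summarise(), for comparison.
group_by_sym <- function(df, column_name) {
  df %>%
    group_by(!!sym(column_name)) %>%
    summarise(Count = n())
}

res_count <- count_by(df, "col1")
res_group <- group_by_sym(df, "col1")

# Both approaches should report the same count for each value of col1.
stopifnot(identical(res_count$Count, res_group$Count))
print(res_count)
```

Either form avoids the original error, which arises because group_by(col_name1) looks for a column literally called col_name1 instead of using the string stored in that variable.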