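I have a tibble containing details of work activities on various projects. I am trying to write a generic function that runs fairly simple queries on the tibble using dplyr verbs, of the form: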

    df %>%
      group_by(user_id) %>%
      summarise(total_time = sum(duration))

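This is simple enough when there is only one grouping variable and one summary variable. My problem arises when I try to generalise the function to accept multiple grouping and/or summary variables. I have tried to do this with the three functions below (so there is quite a lot of code here).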

    proj_activity_report <- function(query_id) {
      projects <- read_rds('~/work_tracker/projects.rds')
      users <- read_rds('~/work_tracker/users.rds')
      activity <- read_rds('~/work_tracker/activity.rds')

      activity %>%
        filter(project_id %in% query_id) %>%
        left_join(projects, by = 'project_id') %>%
        left_join(users, by = c('user_id.x' = 'user_id')) %>%
        mutate(full_name = paste(forename, surname),
               start_date = format(date(time_started), '%d %b %Y'),
               logged_date = format(date(time_logged), '%d %b %Y')) %>%
        arrange(project_id, time_started) %>%
        select(Activity_ID = activity_id,
               Project_ID = project_id,
               Activity_Type = activity_title,
               Project_Title = project_title,
               Worker_Name = full_name,
               Date_Started = start_date,
               Duration_mins = duration,
               Date_Logged = logged_date,
               Comments = comments,
               Project_Status = project_status)
    }

    proj_activity_grouped <- function(query_id, ...) {
      grouping_vars <- quos(...)
      proj_activity_report(query_id) %>%
        group_by(!!!grouping_vars)
    }

    proj_activity_summ <- function(query_id, grouping_vars, summ_var) {
      query_id <- enquo(query_id)
      summ_var <- enquo(summ_var)
      grouping_vars <- quos(grouping_vars)
      proj_activity_grouped(query_id, !!!grouping_vars) %>%
        summarise(total = sum(!!summ_var))
    }

The function proj_activity_report() works fine, and proj_activity_grouped() seems to work too, because calling

    proj_activity_grouped(102, Worker_Name) %>%
      summarise(total_duration = sum(Duration_mins))

gives the output I expect:

    # A tibble: 12 x 2
       Worker_Name     total_duration
       <chr>                    <dbl>
     1 Ahmed Khan                 690
     2 Craig Stanton             1245
     3 Darnell Lewis             1395
     4 David Silverman            960
     5 Frankie Benton            1275
     6 Jane Benton                855
     7 Li Fan                    1275
     8 Maria Gomes               1200
     9 Sunil Khanna              1080
    10 Suzanne Watson            1380
    11 Theresa Briers            1395
    12 Valerie Jones             1500

(This is dummy data; all of the names are made up.)
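
Since the .rds files are not available to anyone reading this, here is a minimal stand-in for the working part that can be run as-is. It is only a sketch: toy_activity and toy_grouped are made-up names, and the toy tibble assumes nothing more than the three columns used in the call above.

```r
library(dplyr)

# Made-up stand-in for the output of proj_activity_report(); not the real data
toy_activity <- tibble::tibble(
  Project_ID    = c(102, 102, 102, 103),
  Worker_Name   = c("A", "A", "B", "B"),
  Duration_mins = c(30, 45, 60, 15)
)

# Same shape as proj_activity_grouped(), but reading from the toy tibble
toy_grouped <- function(query_id, ...) {
  grouping_vars <- quos(...)            # capture the bare column names
  toy_activity %>%
    filter(Project_ID %in% query_id) %>%
    group_by(!!!grouping_vars)          # splice them into group_by()
}

toy_result <- toy_grouped(102, Worker_Name) %>%
  summarise(total_duration = sum(Duration_mins))
```

Here query_id reaches filter() as an ordinary numeric value, and only the column names are quoted.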
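Things go wrong with proj_activity_summ(). With the code above I get the error:

    Error in filter_impl(.data, quo) : 'match' requires vector arguments

I suspect it has something to do with the way I am handling the variables, but I cannot work out which part is wrong.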
NB: I am using dplyr version 0.7.0.
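
For comparison, below is a hedged sketch of one way the summarising wrapper might handle its arguments. It is not a confirmed fix. It rests on two assumptions: that query_id is an ordinary value (so wrapping it in enquo() hands filter() a quosure instead of a vector), and that quos(grouping_vars) captures the symbol grouping_vars itself rather than the columns passed in. The names toy_log and toy_summ and the toy data are hypothetical, and the grouping columns are moved into ... as an illustration.

```r
library(dplyr)

# Hypothetical toy data; column names mirror the report output
toy_log <- tibble::tibble(
  Project_ID    = c(102, 102, 102, 102, 103),
  Worker_Name   = c("A", "A", "A", "B", "B"),
  Activity_Type = c("x", "x", "y", "x", "x"),
  Duration_mins = c(30, 45, 10, 60, 15)
)

toy_summ <- function(query_id, summ_var, ...) {
  summ_var <- enquo(summ_var)    # a column name, so it is quoted
  grouping_vars <- quos(...)     # zero or more column names, quoted together
  toy_log %>%
    filter(Project_ID %in% query_id) %>%   # query_id is a plain value, left unquoted
    group_by(!!!grouping_vars) %>%
    summarise(total = sum(!!summ_var))
}

# Two grouping columns, one summary column
toy_totals <- toy_summ(102, Duration_mins, Worker_Name, Activity_Type)
```

Passing the grouping columns through ... avoids having to quote a single argument that holds several names, at the cost of moving summ_var ahead of them in the signature.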