我试图弄清楚如何处理自定义聚合函数,该函数将与dplyr的评估原则一起使用。我想创建一个形状函数:
custom_aggregation <- function (data, stat_funs = list(mean, median), agg_col, ...)
其中data
是data.frame,stat_funs
是要应用的功能的列表,agg_col
指示要在哪个列上应用这些功能,...
正在分组列。
对于单个聚集功能,我使用如下代码:
custom_aggregation <- function (data, stat_fun, agg_col, ...) {
groups <- enquos(...)
agg_col <- enquo(agg_col)
stat_fun_enq <- enquo(stat_fun)
agg_name <- paste0(quo_name(agg_col), '', quo_name(stat_fun_enq))
data %>%
group_by(!!!groups) %>%
summarise(!!agg_name := stat_fun(!!agg_col))
}
# I can try to call the function on mtcars data.frame:
custom_aggregation(mtcars, stat_fun = mean, agg_col = qsec, cyl, am)
我不知道如何处理函数列表(stat_fun
参数)。
我尝试过:
map(stat_fun, enquo) # and the basic lapply equivalent with variants
lapply(stat_fun, function(i) {
stat_fun_enq <- enquo(i)
})
lapply(seq_along(stat_fun), function(i) {
stat_fun_enq <- enquo(stat_fun[[i]])
})
有人可以指导我做错什么吗?
答案 0 :(得分:0)
一种选择是将函数作为单子传递,然后通过map
list
,求值(!!
)来应用函数
library(tidyverse)
custom_aggregation <- function (data, stat_fun, agg_col, ...) {
groups <- enquos(...)
agg_col <- enquo(agg_col)
agg_name <- rlang::as_name(stat_fun)
data %>%
group_by(!!! groups) %>%
summarise((!!agg_name) := (!!stat_fun)(!!agg_col))
}
不清楚预期的输出格式
quos(mean, median) %>%
map(~ custom_aggregation(mtcars, stat_fun = .x, agg_col = qsec, cyl, am))
#[[1]]
# A tibble: 6 x 3
# Groups: cyl [3]
# cyl am mean
# <dbl> <dbl> <dbl>
#1 4 0 21.0
#2 4 1 18.4
#3 6 0 19.2
#4 6 1 16.3
#5 8 0 17.1
#6 8 1 14.6
#[[2]]
# A tibble: 6 x 3
# Groups: cyl [3]
# cyl am median
# <dbl> <dbl> <dbl>
#1 4 0 20.0
#2 4 1 18.6
#3 6 0 19.2
#4 6 1 16.5
#5 8 0 17.4
#6 8 1 14.6
如果我们需要一个数据集
library(rlang)
custom_aggregation <- function (data, stat_fun, agg_col, ...) {
groups <- enquos(...)
agg_col <- enquo(agg_col)
nm1 <- str_c(rlang::as_name(agg_col),
map_chr(rlang::call_args(rlang::enexpr(stat_fun)),
rlang::as_name), sep="_")
data %>%
group_by(!!! groups) %>%
summarise_at(vars(rlang::as_name(agg_col)), stat_fun) %>%
rename_at(vars(starts_with('fn')), ~ nm1)
}
-测试
custom_aggregation(mtcars, stat_fun = list(sum), agg_col = qsec, cyl, am) # A tibble: 6 x 3
# Groups: cyl [3]
# cyl am qsec
# <dbl> <dbl> <dbl>
#1 4 0 62.9
#2 4 1 148.
#3 6 0 76.9
#4 6 1 49.0
#5 8 0 206.
#6 8 1 29.1
custom_aggregation(mtcars, stat_fun = list(sum, max), agg_col = qsec, cyl, am)
# A tibble: 6 x 4
# Groups: cyl [3]
# cyl am qsec_sum qsec_max
# <dbl> <dbl> <dbl> <dbl>
#1 4 0 62.9 22.9
#2 4 1 148. 19.9
#3 6 0 76.9 20.2
#4 6 1 49.0 17.0
#5 8 0 206. 18
#6 8 1 29.1 14.6
custom_aggregation(mtcars, stat_fun = list(sum, min, max), agg_col = qsec, cyl, am)
# A tibble: 6 x 5
# Groups: cyl [3]
# cyl am qsec_sum qsec_min qsec_max
# <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 0 62.9 20 22.9
#2 4 1 148. 16.7 19.9
#3 6 0 76.9 18.3 20.2
#4 6 1 49.0 15.5 17.0
#5 8 0 206. 15.4 18
#6 8 1 29.1 14.5 14.6