I have a function that uses dplyr to summarize a variable by a set of user-defined groups:
library(tidyverse)

get_var_summary <- function(.data, .target_var, .group_vars = vars()) {
  # Capture the bare column name so it can be unquoted with !! below
  .target_var = enquo(.target_var)
  return(
    .data %>%
      # Drop rows where the target variable is missing
      filter(!is.na(!! .target_var)) %>%
      group_by_at(.vars = .group_vars) %>%
      summarize(
        mean = mean(!! .target_var),
        sd = sd(!! .target_var),
        ci = qnorm(0.975) * sd(!! .target_var) / sqrt(n()),
        median = median(!! .target_var),
        n = n()
      ) %>%
      # sd() is NA for single-row groups; report Inf instead
      mutate(
        sd = ifelse(is.na(sd), Inf, sd),
        ci = ifelse(is.na(ci), Inf, ci)
      ) %>%
      ungroup()
  )
}

mtcars %>%
  get_var_summary(wt, .group_vars = vars(cyl))
This returns:
# A tibble: 3 x 6
    cyl  mean    sd    ci median     n
  <dbl> <dbl> <dbl> <dbl>  <dbl> <int>
1    4.  2.29 0.570 0.337   2.20    11
2    6.  3.12 0.356 0.264   3.22     7
3    8.  4.00 0.759 0.398   3.76    14
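Now, so that I can easily reuse the same .group_vars while occasionally supplying one more grouping variable, I want to define another function that calls get_var_summary but appends an extra column to .group_vars: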
get_var_summary_by_another <- function(.data, .extra_var, .target_var, .group_vars = vars()) {
  # how do I add .extra_var to .group_vars?
}
How can I do this?
Answer 0 (score: 2)
The idea is to splice the existing .group_vars with !!! and add .extra_var alongside them in a new vars() call:
get_var_summary_by_another <- function(.data, .extra_var, .target_var, .group_vars = vars()) {
  # Capture both bare column names as quosures
  .extra_var = enquo(.extra_var)
  .target_var = enquo(.target_var)
  # Splice the existing grouping variables and append the extra one
  .group_vars = vars(!!! .group_vars, !! .extra_var)
  return(
    .data %>% get_var_summary(
      !! .target_var,
      .group_vars
    )
  )
}

mtcars %>%
  get_var_summary_by_another(gear, .target_var = wt, .group_vars = vars(cyl))
This returns:
# A tibble: 8 x 7
    cyl  gear  mean      sd      ci median     n
  <dbl> <dbl> <dbl>   <dbl>   <dbl>  <dbl> <int>
1    4.    3.  2.46 Inf     Inf       2.46     1
2    4.    4.  2.38   0.601   0.416   2.26     8
3    4.    5.  1.83   0.443   0.614   1.83     2
4    6.    3.  3.34   0.173   0.240   3.34     2
5    6.    4.  3.09   0.413   0.405   3.16     4
6    6.    5.  2.77 Inf     Inf       2.77     1
7    8.    3.  4.10   0.768   0.435   3.81    12
8    8.    5.  3.37   0.283   0.392   3.37     2
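
To see what the splicing step does on its own, here is a minimal sketch that builds the combined grouping list outside of any function. The names base_groups, extra and combined are made up for illustration and are not part of the code above. It assumes dplyr and rlang are installed.

```r
library(dplyr)
library(rlang)

# Hypothetical stand-ins for .group_vars and the captured .extra_var
base_groups <- vars(cyl)
extra <- quo(gear)

# !!! splices every element of base_groups into the new vars() call,
# and !! unquotes the single extra quosure after them
combined <- vars(!!! base_groups, !! extra)

# Turn each quosure into a label to inspect the resulting group columns
group_labels <- unname(vapply(combined, as_label, character(1)))
print(group_labels)
```

Because the spliced variables come first, the extra variable ends up as the last grouping column, which matches the column order (cyl, then gear) in the output above.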